Advanced Mean Field Methods
Neural Information Processing Series
Michael I. Jordan, Sara I. Solla
Advances in Large Margin Classifiers
Alexander J. Smola, Peter L. Bartlett, Bernhard Schölkopf, and Dale Schuurmans,
eds., 2000
Advanced Mean Field Methods: Theory and Practice
Manfred Opper and David Saad, eds., 2001
Advanced Mean Field Methods
Theory and Practice
Edited by
Manfred Opper and David Saad
The MIT Press
Cambridge, Massachusetts
London, England
© 2001 Massachusetts Institute of Technology
All rights reserved. No part of this book may be reproduced in any form by any electronic
or mechanical means (including photocopying, recording, or information storage and retrieval)
without permission in writing from the publisher.
Library of Congress Cataloging-in-Publication Data
Advanced mean field methods : theory and practice/edited by
Manfred Opper and David Saad
p. cm.-(Neural Information Processing Series)
Includes bibliographical references.
ISBN 0-262-15054-9 (alk. paper)
1. Mean field theory. I. Opper, Manfred. II. Saad, David.
QC174.85.M43 A38 2001
530.15'95-dc21 00-053322
CONTENTS
Series Foreword vii
Foreword viii
Contributors xi
Acknowledgments xiv
1 Introduction 1
Manfred Opper and David Saad
2 From Naive Mean Field Theory to the TAP Equations 7
Manfred Opper and Ole Winther
3 An Idiosyncratic Journey Beyond Mean Field Theory 21
Jonathan S. Yedidia
4 Mean Field Theory for Graphical Models 37
Hilbert J. Kappen and Wim J. Wiegerinck
5 The TAP Approach to Intensive and Extensive Connectivity Systems 51
Yoshiyuki Kabashima and David Saad
6 TAP For Parity Check Error Correcting Codes 67
David Saad, Yoshiyuki Kabashima and Renato Vicente
7 Adaptive TAP Equations 85
Manfred Opper and Ole Winther
8 Mean-field Theory of Learning: From Dynamics to Statics 99
K. Y. Michael Wong, S. Li and Peixun Luo
9 Saddle-point Methods for Intractable Graphical Models 119
Fernando J. Pineda, Cheryl Resch and I-Jeng Wang
10 Tutorial on Variational Approximation Methods 129
Tommi S. Jaakkola
11 Graphical Models and Variational Methods 161
Zoubin Ghahramani and Matthew J. Beal
12 Some Examples of Recursive Variational Approximations for Bayesian Inference 179
K. Humphreys and D.M. Titterington
13 Tractable Approximate Belief Propagation 197
David Barber
14 The Attenuated Max-Product Algorithm 213
Brendan J. Frey and Ralf Koetter
15 Comparing the Mean Field Method and Belief Propagation for Approximate Inference in MRFs 229
Yair Weiss
16 Information Geometry of α-Projection in Mean Field Approximation 241
Shun-ichi Amari, Shiro Ikeda and Hidetoshi Shimokawa
17 Information Geometry of Mean-Field Approximation 259
Toshiyuki Tanaka
SERIES FOREWORD
The yearly Neural Information Processing Systems (NIPS) workshops bring to­
gether scientists with broadly varying backgrounds in statistics, mathematics, com­
puter science, physics, electrical engineering, neuroscience, and cognitive science,
unified by a common desire to develop novel computational and statistical strate­
gies for information processing, and to understand the mechanisms for information
processing in the brain. As opposed to conferences, these workshops maintain a
flexible format that both allows and encourages the presentation and discussion of
work in progress, and thus serve as an incubator for the development of important
new ideas in this rapidly evolving field.
The Series Editors, in consultation with workshop organizers and members of
the NIPS Foundation Board, select specific workshop topics on the basis of sci­
entific excellence, intellectual breadth, and technical impact. Collections of papers
chosen and edited by the organizers of specific workshops are built around ped­
agogical introductory chapters, while research monographs provide comprehensive
descriptions of workshop-related topics, to create a series of books that provides
a timely, authoritative account of the latest developments in the exciting field of
neural computation.
Michael I. Jordan, Sara I. Solla
FOREWORD
The links between statistical physics and the information sciences-including com­
puter science, statistics, and communication theory-have grown stronger in recent
years, as the needs of applications have increasingly led researchers in the informa­
tion sciences towards the study of large-scale, highly-coupled probabilistic systems
that are reminiscent of models in statistical physics. One useful link is the class of
Markov Chain Monte Carlo (MCMC) methods, sampling-based algorithms whose
roots lie in the simulation of gases and condensed matter, but whose appealing gen­
erality and simplicity of implementation have sparked new applications throughout
the information sciences. Another source of links, currently undergoing rapid devel­
opment, is the class of mean-field methods that are the topic of this book. Mean­
field methods aim to solve many of the same problems as are addressed by MCMC
methods, but do so using different conceptual and mathematical tools. Mean-field
methods are deterministic methods, making use of tools such as Taylor expansions
and convex relaxations to approximate or bound quantities of interest. While the
analysis of MCMC methods reposes on the theory of Markov chains and stochastic
matrices, mean-field methods make links to optimization theory and perturbation
theory.
Underlying much of the heightened interest in these links between statistical
physics and the information sciences is the development (in the latter field) of
a general framework for associating joint probability distributions with graphs,
and for exploiting the structure of the graph in the computation of marginal
probabilities and expectations. Probabilistic graphical models are graphs-directed
or undirected-annotated with functions defined on local clusters of nodes that
when taken together define families of joint probability distributions on the graph.
Not only are the classical models of statistical physics instances of graphical models
(generally involving undirected graphs), but many applied probabilistic models with
no obvious connection to physics are graphical models as well-examples include
phylogenetic trees in genetics, diagnostic systems in medicine, unsupervised learning
models in machine learning, and error-control codes in information theory. The
availability of the general framework has made it possible for ideas to flow more
readily between these fields.
In physics one of the principal applications of mean-field methods is the predic­
tion of "phase transitions", discontinuities in aggregate properties of a system under
the scaling of one or more parameters associated with the system. A physicist read­
ing the current book may thus be surprised by the relatively infrequent occurrence
of the term "phase transition". In the applications to the information sciences, it
is often the values of the "microscopic" variables that are of most interest, while
the "macroscopic" properties of the system are often of secondary interest. Thus
in the genetics application we are interested in the genotype of specific individuals;
in the diagnostic applications our interest is in the probability of specific diseases;
and in error-control coding we wish to recover the bits in the transmitted message.
Moreover, in many of these applications we are interested in a specific graph, whose
parameters are determined by statistical methods, by a domain expert or by a de­
signer, and it is a matter of secondary interest as to how aggregate properties of
the probability distribution would change in some hypothetical alternative graph
in which certain parameters have been scaled.
This is not to say that aggregate properties of probability distributions are not
of interest; indeed they are key to understanding the mean-field approach. The
calculation of the probability distribution of any given "microscopic" variable-the
marginal probability of a node in the graph-is an aggregation operation, requiring
summing or integrating the joint probability with respect to all other variables. In
statistical terms one is calculating a "log likelihood"; the physics terminology is
the "free energy". In the computational framework referred to above one attempts
to exploit the constraints imposed by the graphical structure to compute these
quantities efficiently, essentially using the missing edges in the graph to manage
the proliferation of intermediate terms that arise in computing multiple sums or
integrals. This approach has been successful in many applied problems, principally
involving graphs in the form of trees or chains. For more general graphs, however,
a combinatorial explosion often rises up to slay any attempt to calculate marginal
probabilities exactly.
Unfortunately, it is precisely these graphs that are not in the form of trees and
chains that are on the research frontier in many applied fields. New ideas are needed
to cope with these graphs, and recent empirical results have suggested mean-field
and related methods as candidates.
Mean-field methods take a more numerical approach to calculations in graph­
ical models. There are several ways to understand mean-field methods, and the
current book provides excellent coverage of all of the principal perspectives. One
major theme is that of "relaxation", an idea familiar from modern optimization
theory. Rather than computing a specific probability distribution, one relaxes the
constraints defining the probability distribution, obtaining an optimization problem
in which the solution to the original problem is the (unique) optimum. Relaxing
constraints involves introducing Lagrange multipliers, and algorithms can be de­
veloped in which the original, "primal" problem is solved via "dual" relationships
among the Lagrangian variables.
This optimization perspective is important to understanding the computational
consequences of adopting the physics framework. In particular, in the physics
framework the free energy takes a mathematical form in which constraints are
readily imposed and readily "relaxed". Note also that the physics framework
permits expressing the free energy as the sum of two terms-the "average energy"
and the "entropy". Computational methods can be developed that are geared to
the specific mathematical forms taken by these terms.
The optimization perspective that mean-field theory brings to the table is useful
in another way. In particular, the graphical models studied in the information
sciences are often not fully determined by a prior scientific theory, but are viewed
as statistical models that are to be fit to observed data. Fitting a model to
data generally involves some form of optimization-in the simplest setting one
maximizes the log likelihood with respect to the model parameters. As we have
discussed, the mean-field approach naturally treats the log likelihood (free energy)
as a parameterized function to be optimized, and it might be expected that this
approach would therefore extend readily to likelihood-based statistical methods.
Indeed, the simplest mean-field methods yield a lower bound on the log likelihood,
and one can maximize this lower bound as a surrogate for the (generally intractable)
maximization of the log likelihood.
While all of these arguments may have appeal to the physicist, particularly the
physicist contemplating unemployment in the modern "information economy", for
the information scientist there is room for doubt. A survey of the models studied by
the physicists reveals properties that diverge from the needs of the information scien­
tist. Statistical physical models are often homogeneous-the parameters linking the
nodes are the same everywhere in the graph. More generally, the physical models
choose parameters from distributions ("spin-glass models") but these distributions
are the same everywhere in the graph. The models allow "field terms" that are
equivalent to "observed data" in the statistical setting, but often these field terms
are assumed equal. Various graphical symmetries are often invoked. Some models
assume infinite-ranged connections. All of these assumptions seem rather far from
the highly inhomogeneous, irregular setting of models in settings such as genetics,
medical diagnosis, unsupervised learning or error-control coding.
While it is possible that some of these assumptions are required for mean­
field methods to succeed, there are reasons to believe that the scope of mean-field
methods extends beyond the restrictive physical setting that engendered them.
First, as reported by several of the papers in this volume, there have been a number
of empirical successes involving mean-field methods, in problems far from the
physics setting. Second, many of the assumptions have been imposed with the goal
of obtaining analytical results, particularly as part of the hunt for phase transitions.
Viewed as a computational methodology, mean-field theory may not require such
strong symmetries or homogeneities. Third, there is reason to believe that the
exact calculation techniques and mean-field techniques exploit complementary
aspects of probabilistic graphical model structure, and that hybrid techniques may
allow strong interactions to be removed using exact calculations, revealing more
homogeneous "residuals" that can be handled via mean-field algorithms.
Considerations such as these form the principal subject matter of the book and
are addressed in many of its chapters. While the book does an admirable job of
covering the basics of mean-field theory in the classical setting of Ising and related
models, the main thrust is the detailed consideration of the new links between
computation and general probabilistic modeling that mean-field methods promise
to expose. This is an exciting and timely topic, and the current book provides the
best treatment yet available.
Michael I. Jordan
Berkeley
CONTRIBUTORS
Shun-Ichi Amari
RIKEN Brain Science Institute,
Hirosawa, 2-1, Wako-shi, Saitama,
351-0198, Japan.
amari@brain.riken.go.jp
David Barber
The Neural Computing
Research Group,
School of Engineering and Applied
Science,
Aston University,
Birmingham B4 7ET, UK.
barberd@aston.ac.uk
Matthew J. Beal
Gatsby Computational Neuroscience
Unit,
University College London,
17 Queen Square,
London WC1N 3AR,
UK.
m.beal@gatsby.ucl.ac.uk
Brendan J. Frey
Computer Science,
University of Waterloo,
Davis Centre,
Waterloo,
Ontario N2L 3G1,
Canada.
frey@uwaterloo.ca
Zoubin Ghahramani
Gatsby Computational Neuroscience
Unit,
University College London,
17 Queen Square,
London WC1N 3AR, UK.
zoubin@gatsby.ucl.ac.uk
Keith Humphreys
Stockholm University/KTH,
Department of Computer and Systems
Sciences,
Electrum 230,
SE-164 40 Kista, Sweden.
keith@dsv.su.se
Shiro Ikeda
PRESTO, JST,
Lab. for Mathematical Neuroscience,
BSI, RIKEN
Hirosawa 2-1, Wako-shi, Saitama,
351-0198 Japan.
Shiro.Ikeda@brain.riken.go.jp
Tommi S. Jaakkola
Department of Computer Science and
Electrical Engineering,
Massachusetts Institute of Technology,
Cambridge, MA 02139, USA.
tommi@ai.mit.edu
Yoshiyuki Kabashima
Department of Computational
Intelligence and Systems Science,
Tokyo Institute of Technology,
Yokohama 2268502, Japan.
kaba@dis.titech.ac.jp
Bert Kappen
Foundation for Neural Networks (SNN),
Department of Medical Physics and
Biophysics,
University of Nijmegen,
Geert Grooteplein 21,
CPK1 231,
NL 6525 EZ Nijmegen.
The Netherlands.
bert@mbfys.kun.nl
Ralf Koetter
University of Illinois at
Urbana-Champaign,
115 Computing Systems Research Lab,
1308 W. Main,
Urbana, IL 61801
USA.
koetter@rainier.csl.uiuc.edu
Song Li
Department of Physics,
The Hong Kong University of Science
and Technology,
Clear Water Bay,
Kowloon, Hong Kong.
phlisong@ust.hk
Peixun Luo
Department of Physics,
The Hong Kong University of Science
and Technology,
Clear Water Bay,
Kowloon, Hong Kong.
physlpx@ust.hk
Manfred Opper
The Neural Computing
Research Group,
School of Engineering and Applied
Science,
Aston University,
Birmingham B4 7ET, UK.
opperm@aston.ac.uk
Fernando J. Pineda
Research and Technology Development
Center
The Johns Hopkins University Applied
Physics Laboratory
Johns Hopkins Rd. Laurel,
MD 20723-6099, USA.
fernando.pineda@jhuapl.edu
Cheryl Resch
Research and Technology Development
Center
The Johns Hopkins University Applied
Physics Laboratory
Johns Hopkins Rd. Laurel,
MD 20723-6099, USA.
cheryl.resch@jhuapl.edu
David Saad
The Neural Computing
Research Group,
School of Engineering and Applied
Science,
Aston University,
Birmingham B4 7ET, UK.
saadd@aston.ac.uk
Hidetoshi Shimokawa
Faculty of Engineering,
7-3-1, Hongo, Bunkyo-ku,
Tokyo 113-8656,
Japan.
simokawa@sat.t.u-tokyo.ac.jp
D.M. Titterington
Department of Statistics,
University of Glasgow,
Glasgow G12 8QQ,
Scotland, UK.
mike@stats.gla.ac.uk
Toshiyuki Tanaka
Department of Electronics and
Information Engineering,
Faculty of Engineering,
Tokyo Metropolitan University,
Circuits and Systems Engineering
Laboratory,
1-1 Minami Oosawa, Hachioji,
Tokyo, 192-0397
Japan.
tanaka@eei.metro-u.ac.jp
Renato Vicente
The Neural Computing
Research Group,
School of Engineering and Applied
Science,
Aston University,
Birmingham B4 7ET, UK.
vicenter@aston.ac.uk
I-Jeng Wang
Research and Technology Development
Center
The Johns Hopkins University Applied
Physics Laboratory
Johns Hopkins Rd. Laurel,
MD 20723-6099, USA.
i-jeng.wang@jhuapl.edu
Yair Weiss
Computer Science Division
UC Berkeley, 485 Soda Hall
Berkeley, CA 94720-1776,
USA.
yweiss@cs.berkeley.edu
Wim Wiegerinck
Foundation for Neural Networks (SNN),
Department of Medical Physics and
Biophysics,
University of Nijmegen,
Geert Grooteplein 21,
CPK1 231,
NL 6525 EZ Nijmegen.
The Netherlands.
wimw@mbfys.kun.nl
Ole Winther
Department of Theoretical Physics,
Lund University,
Sölvegatan 14A,
S - 223 62 Lund,
Sweden. winther@thep.lu.se
K. Y. Michael Wong
Department of Physics,
The Hong Kong University of Science
and Technology,
Clear Water Bay,
Kowloon, Hong Kong.
phkywong@ust.hk
Jonathan S. Yedidia
MERL - Mitsubishi Electric Research
Laboratories, Inc.
201 Broadway, 8th Floor,
Cambridge, MA 02139,
USA.
yedidia@merl.com
ACKNOWLEDGMENTS
We would like to thank Wei Lee Woon for helping us with preparing the manuscript
for publication and the participants of the post-NIPS workshop on Advanced Mean
Field Methods for their contribution to this book.
Finally, we would like to thank Julianne and Christiane, Felix, Jonathan and
Lior for their tolerance during this very busy summer.
1 Introduction
Manfred Opper and David Saad
A major problem in modern probabilistic modeling is the huge computational com­
plexity involved in typical calculations with multivariate probability distributions
when the number of random variables is large.
Take, for instance, probabilistic data models such as Bayesian belief networks
which have found widespread applications in artificial intelligence and neural com­
putation. These models explain observed (visible) data by a set of hidden random
variables using the joint distribution of both sets of variables. Statistical inference
about the unknown hidden variables requires computing their posterior expectation
given the observations. Model selection is often based on maximizing the marginal
distribution of the observed data with respect to the model parameters. Since ex­
act calculation of both quantities becomes infeasible when the number of hidden
variables is large and Monte Carlo sampling techniques may also reach their limits,
there is growing interest in methods which allow for efficient approximations.
One of the simplest and most prominent approximations is based on the so­
called Mean Field (MF) method which has a long history in statistical physics. In
this approach, the mutual influence between random variables is replaced by an
effective field, which acts independently on each random variable. In its simplest
version, this can be formulated as an approximation of the true distribution by a
factorizable one. A variational optimization of such products results in a closed set
of nonlinear equations for their expected values, which usually can be solved in a
time that only grows polynomially in the number of variables.
Presently, there is an increasing research activity aimed at developing improved
approximations which take into account part of the neglected correlations between
random variables, and at exploring novel fields of applications for such advanced
mean field methods.
Significant progress has been made by researchers coming from a variety of
scientific backgrounds like statistical physics, computer science and mathematical
statistics. These fields often differ in their scientific terminologies, intuitions and
biases. For instance, physicists often prefer typically good approximations (with less
clear worst case behavior) over the rigorous results favored by computer scientists.
Since such 'cultural' differences may slow down the exchange of ideas we organized
the NIPS workshop on Advanced Mean Field Methods in 1999 to encourage further
interactions and cross fertilization between fields. The workshop has revealed a
variety of deep connections between the different approaches (like that of the Bethe
approximation and belief propagation techniques) which has already led to the
development of a novel algorithm.
This book is a collection of the presentations given at the workshop together
with a few other related invited papers. The following problems and questions are
among the central topics discussed in this book:
• Advanced MF approaches like the TAP (Thouless, Anderson, Palmer) method
have been originally derived for very specific models in statistical physics. How
can we expand their theoretical foundations in order to make the methods widely
applicable within the field of probabilistic data models?
• What are the precise relations between the statistical physics approaches and
other methods which have been developed in the computer science community,
like the belief propagation technique? Can we use this knowledge to develop novel
and even more powerful inference techniques by unifying and combining these
approaches?
• The quality of the MF approximation is, in general, unknown. Can we predict
when a specific MF approximation will work better than another? Are there
systematic ways to improve these approximations such that our confidence in
the results will increase?
• What are the promising application areas for advanced mean field approaches and
what are the principled ways of solving the mean field equations when the structure
of the dependencies between random variables is sufficiently complicated?
The chapters of this book can be grouped into two parts. While chapters 2-9
focus mainly on approaches developed in the statistical physics community, chapters
10-17 are more biased towards ideas originated in the computer science/statistics
communities.
Chapters 2 and 3 can serve as introductions to the main ideas behind the
statistical physics approaches. Chapter 2 explains three different types of MF
approximations, demonstrated on a simple Boltzmann machine-like Ising model.
Naive mean field equations are derived by the variational method and by a field
theoretic approach. In the latter, high dimensional sums are transformed into
integrals over auxiliary variables which are approximated by Laplace's method.
The TAP MF equations account for correlations between random variables by an
approximate computation of the reaction of all variables to the deletion of a single
variable from the system.
Chapter 3 explains the role of the statistical physics free energy within the
framework of probabilistic models and shows how different approximations of the
free energy lead to various advanced MF methods. In this way, the naive MF theory,
the TAP approach and the Bethe approximation are derived as the first terms in
two different systematic expansions of the free energy, the Plefka expansion and
the cluster variation method of Kikuchi. Remarkably, the minima of the Bethe
free energy are identified as the fixed points of the celebrated belief propagation
algorithm for inference in graphical models. This connection opens new ways for
systematic improvements of this algorithm.
The following 5 chapters present various generalizations and applications of
TAP-like mean field approaches. A novel derivation of the TAP approximation is
presented in chapter 4. It is based on a truncation of a power series expansion
of marginal distributions with respect to the couplings between random variables.
This derivation opens up new fields of applications for the TAP approach, such as
graphical models with general types of interactions. It also allows one to treat stochastic
networks with asymmetric couplings for which a closed form of the stationary
probability distribution is not available. Numerical simulations for graphical models
and comparisons with simple MF theory demonstrate the significance of the TAP
method.
Chapter 5 addresses the problem of deriving the correct form of the TAP equa­
tions for models where the interactions between random variables have significant
correlations. It also bridges the gap between the TAP approach and the belief prop­
agation method. Demonstrating the method on the Hopfield model, the original set
of random variables is augmented by an auxiliary set such that the mutual depen­
dencies are weak enough to justify a tree approximation. The equations derived for
the corresponding set of conditional probabilities on this tree reduce to the well
known TAP equations for the Hopfield model in the limit of extensive connectivity.
Chapter 6 employs the framework presented in chapter 5 to investigate decoding
techniques within the context of low-density parity-check error-correcting codes. It
shows the similarity between the decoding dynamics obtained using the TAP approach
and the method of belief propagation. Numerical experiments examine the efficacy
of the method as a decoding algorithm by comparing the results obtained with the
analytical solutions.
Chapter 7 introduces a method for adapting the TAP approach to a concrete
set of data, providing another answer to the problem raised in chapter 5. The
method avoids the assumptions, usually made in the cavity derivation of the TAP
equations, about the distribution of interactions between random variables. By
using the cavity method together with linear response arguments, an extra set
of data dependent equations for the reaction terms is obtained. Applications of
the adaptive TAP approximation to the Hopfield model as well as to Bayesian
classification are presented.
Chapter 8 presents a TAP-like mean field theory to treat stochastic dynamical
equations. The cavity method is used to derive dynamical mean field equations for
computing the temporal development of averages. The method is applied to the
average case performance of stochastic learning algorithms for neural networks. It
is shown how static averages over the steady state distribution are obtained in the
infinite time limit. The chapter sheds more light on the meaning and on the basic
assumptions behind the cavity approach by showing how the formalism must be
altered in the case of a rugged energy landscape.
Chapter 9 applies the field theoretic mean field approach to computing the
marginal probability of the visible variables in graphical models. In this method,
the relevant random variables are decoupled using auxiliary integration variables.
The summations over the huge number of values of the random variables can now be
performed exactly. The remaining integrals are performed by a quadratic expansion
around the saddlepoint. As shown for two examples of Bayesian belief networks,
this approximation can yield a dramatic improvement over the results achieved by
applying the variational method using a factorized distribution.
Chapter 10 presents a general introduction to the variational method and its
application to inference in probabilistic models. By reformulating inference tasks
as optimization problems, tractable approximations can be obtained by suitable
restriction of the solution space. The standard MF method is generalized by min­
imizing the Kullback-Leibler divergence using factorized variational distributions,
where each factor contains a tractable substructure of variables. A different way
of decoupling random variables is achieved by using variational transformations
for conditional probabilities. The chapter discusses various fields of applications of
these ideas.
Chapters 11 and 12 discuss applications and modifications of the variational
method for complex probabilistic models with hidden states. Chapter 11 shows
how a factorial approximation to the distribution of hidden states can be used
to obtain a tractable approximation for the E-step of the EM algorithm for
parameter estimation. This idea can be generalized to model estimation in a
Bayesian framework, where a factorization of the joint posterior of parameters and
hidden variables enables an approximate optimization of the Bayesian evidence. The
occurring variational problems can be solved efficiently by a Bayesian generalization
of the EM algorithm for exponential models and their conjugate priors. The method
is demonstrated on mixtures of factor analyzers and state-space models.
Chapter 12 reconsiders the Bayesian inference problem with hidden variables
discussed in the previous chapter. It offers alternative approaches for approximate
factorizations of posterior distributions in cases, when the standard variational
method becomes computationally infeasible. In these recursive procedures, a factor­
ized approximation to the posterior is updated any time a new observation arrives.
A recursive variational optimization is compared with the probabilistic editor which
recursively matches moments of marginal posterior distributions. The Quasi-Bayes
method replaces hidden variables at any update of the posterior by their approxi­
mate posterior expectations based on the already collected data. The probabilistic
editor outperforms the other two strategies in simulations of a toy neural network
and a simple hidden Markov model.
Chapter 13 gives an introduction to belief propagation (BP) for directed and
undirected graphical models. BP is an inference technique which is exact for graphs
with a tree structure. However, the method may become intractable in densely
connected directed graphs. To cope with the computational complexity, an integral
transformation of the intractable sums together with a saddlepoint approximation,
similar to the field theoretic MF approach discussed in chapter 9, is introduced.
Simulations for a graphical model which allows representations by both directed
and undirected graphs, show that the method outperforms a simple variational MF
approximation and undirected BP.
Chapters 14 and 15 investigate the performance of BP inference algorithms
when applied to probabilistic models with loopy graphs. In such a case, exact
inference can no longer be guaranteed. Chapter 14 introduces a modification of the
max-product algorithm designed to compute the maximum posterior probability
(MAP). By properly attenuating the BP messages, the algorithm can deal
with the dependencies introduced by the cycles in the graph. It is shown rigorously
for codes on graphs that in this way the exact global MAP configuration of the
random variables is reached, if the algorithm converges. The question of when such
an algorithm converges remains open.
Chapter 15 also demonstrates the importance of understanding the actual
dynamics of advanced MF inference algorithms. It compares the performance of BP
to the simple MF method on Markov random field problems. The fixed points of
both algorithms coincide with zero gradient solutions of different approximate free
energies (see also chapter 3). For a variety of numerical examples BP outperforms
the simple MF method. Remarkably, one finds that BP often converges to a
configuration which is close to the global minimum of the simple MF free energy,
whereas the simple MF algorithm performs worse by getting trapped in local
minima.
Chapters 16 and 17 conclude the book by discussing mean field approaches
from the viewpoint of the information geometric approach to statistical inference.
Understanding the invariant geometric properties of MF approximations may help
to identify new ways of assessing and improving their accuracy.
Chapter 16 introduces a one-parameter family of non-symmetric distance mea­
sures between probability distributions which are demonstrated for the exponential
family of Boltzmann machines. An expansion of these α-divergences for neighboring
distributions involves the Fisher information, which gives the manifold of distribu­
tions a unique invariant metric. Orthogonal projections of a multivariate distribu­
tion onto the manifold of factorized distributions interpolate between the desired
intractable exact marginal distribution (α = -1) for which there is a unique solu­
tion, and the naive MF approximation (α = 1) for which many solutions often exist.
This framework suggests a novel approximation scheme based on an expansion of
the intractable projections in powers of α around the tractable point α = 1. An al­
ternative way to approximate the intractable α-projections, based on a power series
expansion in the coupling matrix, leads to a novel derivation of the TAP equations
and their generalization to arbitrary α.
In chapter 17, the ideas of information geometry are shown to provide a
unified treatment of different mean field methods and shed light on the theoretical
basis of the variational approach. Variational derivation of the naive MF method
may be understood as a projection of the true distribution onto the manifold of
factorized distributions. A family of manifolds is introduced which is controlled
by a single parameter that interpolates between the fully factorized distributions
and the manifold of general distributions, which includes the intractable true
distribution. The desired full variation can be approached perturbatively by an
expansion with respect to this parameter. In this way, a new interpretation of
the Plefka expansion for the TAP equations emerges. The geometric approach is
extended to the variational Bayes method and to the variational approximation to
the EM algorithm which is understood as the alternation of two projection types.
This book is aimed at providing a fairly comprehensive overview of recent de­
velopments in the area of advanced mean field theories, examining their theoretical
background, links to other approaches and possible novel applications. The chapters
were designed to contain sufficiently detailed material to enable the non-specialist
reader to follow the main ideas with minimal background reading.
2 From Naive Mean Field Theory to
the TAP Equations
Manfred Opper and Ole Winther
We give a basic introduction to three different MF approaches which
will be discussed on a more advanced level in other chapters of this
book. We discuss the Variational, the Field Theoretic and the TAP
approaches and their applications to a Boltzmann machine type of
Ising model.
1 Introduction
Mean field (MF) methods provide tractable approximations for the computation of
high dimensional sums and integrals in probabilistic models. By neglecting certain
dependencies between random variables, a closed set of equations for the expected
values of these variables is derived which often can be solved in a time that
only grows polynomially in the number of variables. The method has its origin
in Statistical Physics where the thermal fluctuations of particles are governed by
high dimensional probability distributions.
In the field of probabilistic modeling, the MF approximation is often identified as
a special kind of the variational approach in which the true intractable distribution
is approximated by an optimal factorized one. On the other hand, a variety of
other approximations with a "mean field" flavor are known in the Statistical
Physics community. However, compared to the variational approach the derivations
of these other techniques seem to be less "clean". For instance, the "field theoretic"
MF approaches may lack a clearcut probabilistic interpretation because of the
occurrence of auxiliary variables, integrated in the complex plane. Hence, one
is often unable to turn such a method into an exact bound. Nevertheless, as
the different contributions to this book show, the power of non-variational MF
techniques should not be ignored.
This chapter does not aim at presenting any new results but rather tries to
give a basic and brief introduction to three different MF approaches which will be
discussed on a more advanced level in other chapters of this book. These are the
Variational, the Field Theoretic and the TAP approaches. Throughout the chapter,
we will explain the application of these methods for the case of an Ising model (also
known as a Boltzmann machine in the field of Neural Computation).
Our review of MF techniques is far from being exhaustive and we expect that
other methods may play an important role in the future. Readers who want to
learn more about Statistical Physics techniques and the MF method may consult
existing textbooks e.g. [16; 19; 33]. A more thorough explanation of the variational
method and its applications will be given in the chapters [5; 7; 9] of this book. A
somewhat complementary review of advanced MF techniques is presented in the
next chapter [32].
2 The Variational Mean Field Method
Perhaps the best known derivation of mean field equations outside the Statistical
Physics community is the one given by the Variational Method. This method
approximates an intractable distribution P(S) of a vector S = (S1, . . . , SN) of
random variables by Q(S) which belongs to a family M of tractable distributions.
The distribution Q is chosen such that it minimizes a certain distance measure
D(Q, P) within the family M.
To enable tractable computations, D(Q, P) is chosen as the relative entropy, or
Kullback-Leibler divergence

\[ \mathrm{KL}(Q\|P) = \sum_{S} Q(S)\,\ln\frac{Q(S)}{P(S)} = \Big\langle \ln\frac{Q}{P} \Big\rangle_Q , \tag{1} \]

where the bracket \(\langle \cdots \rangle_Q\) denotes an expectation with respect to Q. Since KL(Q||P)
is not symmetric in P and Q, one might wonder if KL(P||Q) would be a better
choice (this question is discussed in the two chapters of [28; 1]). The main reason
for choosing (1) is the fact that it requires only computations of expectations with
respect to the tractable distribution Q instead of the intractable P.
We will specialize to the class of distributions P that are given by

\[ P(S) = \frac{e^{-H[S]}}{Z} , \tag{2} \]

where S = (S_1, \ldots, S_N) is a vector of binary (spin) variables \(S_i \in \{-1,+1\}\) and

\[ H[S] = -\sum_{i<j} S_i J_{ij} S_j - \sum_i S_i \theta_i . \tag{3} \]

Finally, the normalizing partition function is

\[ Z = \sum_S e^{-H[S]} . \tag{4} \]

We are interested both in approximations to expectations like \(\langle S_i \rangle\) as well as in
approximations to the value of the free energy \(-\ln Z\).
Inserting P into (1), we get

\[ \mathrm{KL}(Q\|P) = \ln Z + E[Q] - S[Q] , \tag{5} \]

where

\[ S[Q] = -\sum_S Q(S) \ln Q(S) \tag{6} \]

is the entropy of the distribution Q (not to be confused with the random variable
S) and

\[ E[Q] = \sum_S Q(S)\, H[S] \tag{7} \]

is called the variational energy.
The mean field approximation is obtained by taking the approximating family
M to be all product distributions, i.e.

\[ Q(S) = \prod_j Q_j(S_j) . \tag{8} \]

For \(S_i \in \{-1,+1\}\), the most general form of the \(Q_j\)'s is obviously of the form:

\[ Q_j(S_j; m_j) = \frac{1 + S_j m_j}{2} , \tag{9} \]

where the \(m_j\)'s are variational parameters which are identified as the expectations
\(m_j = \langle S_j \rangle_Q\).
Using the statistical independence of the \(S_j\)'s with respect to Q, the variational
entropy is found to be

\[ S[Q] = -\sum_i \Big\{ \frac{1+m_i}{2} \ln\frac{1+m_i}{2} + \frac{1-m_i}{2} \ln\frac{1-m_i}{2} \Big\} , \tag{10} \]

and the variational energy reduces to

\[ E[Q] = \langle H[S] \rangle_Q = -\sum_{i<j} J_{ij}\, m_i m_j - \sum_i m_i \theta_i . \tag{11} \]
Although the partition function Z cannot be computed efficiently, it will not be
needed because it does not depend on Q. Hence, all we have to do is to minimize
the variational free energy

\[ F[Q] = E[Q] - S[Q] . \tag{12} \]

Differentiating (12) with respect to the \(m_i\)'s gives the set of N Mean Field Equations

\[ m_i = \tanh\Big( \sum_j J_{ij}\, m_j + \theta_i \Big) , \qquad i = 1,\ldots,N . \tag{13} \]

The intractable task of computing exact averages over P has been replaced by the
problem of solving the set (13) of nonlinear equations, which can often be done in a
time that grows only polynomially with N. Note that there might be many solutions to
(13) and some of them may not even be local minima of (12) but rather saddles.
Hence, solutions must be compared by their value of the variational free energy
F[Q].
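The tanh form follows from (10) and (11) in one step; spelling out the differentiation (a short calculation added here for completeness):

\[ \frac{\partial F}{\partial m_i} = -\sum_j J_{ij}\, m_j - \theta_i + \frac{1}{2} \ln \frac{1+m_i}{1-m_i} = 0
\quad\Longleftrightarrow\quad
m_i = \tanh\Big( \sum_j J_{ij}\, m_j + \theta_i \Big) , \]

using \(\operatorname{artanh}(m) = \tfrac{1}{2}\ln\frac{1+m}{1-m}\).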
As an extra bonus of the variational MF approximation we get an upper bound
on the exact free energy \(-\ln Z\). Since \(\mathrm{KL}(Q\|P) \ge 0\), we have from (5)

\[ -\ln Z \le E[Q] - S[Q] = F[Q] . \tag{14} \]
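To make the procedure concrete, the following is a minimal sketch (our illustration, not part of the text) of solving (13) by damped fixed-point iteration and evaluating the bound (14); the function name, the damping factor and the random test couplings are all hypothetical choices:

import numpy as np

def naive_mean_field(J, theta, damping=0.5, tol=1e-10, max_iter=10000):
    # Damped fixed-point iteration of m_i = tanh(sum_j J_ij m_j + theta_i), eq. (13).
    m = np.zeros(len(theta))
    for _ in range(max_iter):
        m_new = damping * m + (1 - damping) * np.tanh(J @ m + theta)
        if np.max(np.abs(m_new - m)) < tol:
            m = m_new
            break
        m = m_new
    # Variational free energy F[Q] = E[Q] - S[Q], eqs. (10)-(12); J is assumed
    # symmetric with zero diagonal, so -0.5 m.J.m = -sum_{i<j} J_ij m_i m_j.
    p, q = (1 + m) / 2, (1 - m) / 2
    entropy = -np.sum(p * np.log(p + 1e-300) + q * np.log(q + 1e-300))
    energy = -0.5 * m @ J @ m - theta @ m
    return m, energy - entropy

rng = np.random.default_rng(0)
N = 10
J = rng.normal(scale=1 / np.sqrt(N), size=(N, N))
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)
theta = rng.normal(size=N)
m, F = naive_mean_field(J, theta)
print("MF magnetizations:", np.round(m, 3), " upper bound F[Q] =", F)

Different initializations may converge to different solutions of (13); as noted above, these must then be compared by their value of F[Q].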
Obviously, the mean field approximation takes into account the couplings \(J_{ij}\)
between the random variables but neglects statistical correlations, in the sense
that \(\langle S_i S_j \rangle_Q = \langle S_i \rangle_Q \langle S_j \rangle_Q\). To get some more intuition about the effect of this
approximation, we can compare the mean field equations for \(m_i = \langle S_i \rangle_Q\) (13) with
a set of exact equations which hold for the true distribution P (2). It is not hard to
prove the so-called Callen equations (see e.g. chapter 3 of [19])

\[ \langle S_i \rangle = \Big\langle \tanh\Big( \sum_j J_{ij}\, S_j + \theta_i \Big) \Big\rangle , \qquad i = 1,\ldots,N . \tag{15} \]

Unfortunately both sides of (15) are formulated in terms of expectations (we have
omitted the subscript) with respect to the difficult P. While in (15) the expectation
is outside the nonlinear tanh function, the approximate (13) has the expectation
inside the tanh. Hence, the MF approximation replaces the fluctuating "field"
\(h_i = \sum_j J_{ij} S_j\) by (an approximation to) its mean. Estimating the
variance of \(h_i\) may thus give us an idea of how good the approximation is. We will come
back to this question later.
3 The Linear Response Correction

Although the product distribution Q(S) neglects correlations between the random
variables, there is a simple way of computing a non-vanishing approximation to the
covariances \(\langle S_i S_j \rangle - \langle S_i \rangle \langle S_j \rangle\) based on the MF approach. By differentiating

\[ \langle S_i \rangle = Z^{-1} \sum_S S_i\, e^{-H[S]} \tag{16} \]

with respect to \(\theta_j\), we obtain the linear response relation

\[ \frac{\partial \langle S_i \rangle}{\partial \theta_j} = \langle S_i S_j \rangle - \langle S_i \rangle \langle S_j \rangle . \tag{17} \]

(17) holds only for expectations with respect to the true P but not for the
approximating Q. Hoping that the MF method gives us a reasonable approximation
for \(\langle S_i \rangle\), we can compute the MF approximation to the left hand side of (17) and
get a nontrivial approximation to the right hand side. This approximation has
been applied to Boltzmann machine learning [11] and independent component
analysis [8].
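As an illustration of how the trick is used in practice: differentiating the approximate equations (13) with respect to \(\theta_j\) gives \(\partial m_i / \partial \theta_j = (1 - m_i^2)\big(\delta_{ij} + \sum_k J_{ik}\, \partial m_k / \partial \theta_j\big)\), i.e. the matrix relation \(C \approx (D^{-1} - J)^{-1}\) with \(D = \mathrm{diag}(1 - m_i^2)\). The sketch below (our addition, assuming \(|m_i| < 1\) and the resulting matrix being invertible) computes this approximate covariance from a naive MF solution:

import numpy as np

def linear_response_covariance(J, m):
    # Approximate C_ij = <S_i S_j> - <S_i><S_j> via eq. (17):
    # differentiate m = tanh(J m + theta) w.r.t. theta and solve for the Jacobian.
    D_inv = np.diag(1.0 / (1.0 - m ** 2))
    return np.linalg.inv(D_inv - J)

# Example: C = linear_response_covariance(J, m) with m from the MF solver above;
# the diagonal of C approximates the exact variances 1 - <S_i>^2.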
4 The Field Theoretic Approach
Another way of obtaining a mean field theory is motivated by the idea that we often
have better approximations to the performance of integrals than to the calculation
of discrete sums. If we can replace the expectations over the random variables Si
by integrations over auxiliary "field variables", we can approximate the integrals
using the Laplace or saddle-point methods.
As an example, we consider a simple Gaussian transformation of (2). To avoid
complex representations we assume that the matrix J is positive definite so that
we can write

\[ \exp\Big[ \tfrac{1}{2} \sum_{ij} S_i J_{ij} S_j \Big] = \frac{(\det J)^{-1/2}}{(2\pi)^{N/2}} \int \prod_i dx_i \; \exp\Big[ -\tfrac{1}{2} \sum_{ij} x_i (J^{-1})_{ij}\, x_j + \sum_i x_i S_i \Big] . \tag{18} \]
This transformation is most easily applied to the partition function Z (4) yielding

\[ Z \propto \int \prod_i dx_i \; \exp\Big[ -\tfrac{1}{2} \sum_{ij} x_i (J^{-1})_{ij}\, x_j \Big] \sum_S \exp\Big[ \sum_i S_i (x_i + \theta_i) \Big] , \tag{19} \]

where we have omitted some constants. In this representation, the sums over binary
variables factorize and can be carried out immediately with the result

\[ Z \propto \int \prod_i dx_i \; e^{\Phi(x)} , \tag{20} \]

where

\[ \Phi(x) = -\tfrac{1}{2} \sum_{ij} x_i (J^{-1})_{ij}\, x_j + \sum_i \ln\big[ 2\cosh(x_i + \theta_i) \big] . \tag{21} \]
Hence, we have transformed a high-dimensional sum into a high-dimensional non-
Gaussian integral. Hoping that the major contribution to the integral comes from
values of the function \(\Phi\) close to its maximum, we replace the integral (20) by

\[ Z \propto e^{\Phi(x^0)} , \tag{22} \]

where \(x^0 = \arg\max_x \Phi(x)\). This is termed the Laplace approximation. Setting the
gradient \(\nabla_x \Phi(x)\) equal to zero, we get the set of equations

\[ \sum_j (J^{-1})_{ij}\, x_j^0 = \tanh(x_i^0 + \theta_i) . \tag{23} \]

A comparison of (23) with (13) shows that by identifying the auxiliary variables \(x_i^0\)
with the mean fields via

\[ x_i^0 \equiv \sum_j J_{ij}\, m_j , \tag{24} \]

we recover the same mean field equations as before. This is easily understood from
the fact that we have replaced the integration variables \(x_i\) by constant values. This
leaves us with a partition function for the same type of factorizing distribution

\[ Q(S) \propto \prod_j e^{S_j (x_j^0 + \theta_j)} \tag{25} \]

(written in a slightly different form) that we have used in the variational approach.
Hence, it seems we have not gained anything new. One might even argue that
we have lost something in this derivation, the bound on the free energy \(-\ln Z\).
It is not clear how this could be proved easily within the Laplace approximation.
However, we would like to argue that when interactions between random variables
are more complicated than in the simple quadratic model (7), the field-theoretic
approach decouples the original sums in a very simple and elegant way for which
there may not be an equivalent expression in the variational method. This can often
be achieved by using a Dirac \(\delta\)-function representation which is given by

\[ 1 = \int dh \; \delta(h - x) = \int \frac{dh \, d\hat h}{2\pi} \; e^{i \hat h (h - x)} , \tag{26} \]

where the \(i = \sqrt{-1}\) in the exponent should not be confused with a variable index.
The transformation can be applied to partition functions of the type

\[ Z = \sum_S \prod_j f\Big( \sum_k J_{jk}\, S_k \Big) \tag{27} \]
\[ \;\;= \sum_S \prod_j \int \Big\{ dh_j\, f(h_j)\, \delta\Big( h_j - \sum_k J_{jk}\, S_k \Big) \Big\}
= \int \prod_j \frac{dh_j\, d\hat h_j}{2\pi} \Big( \prod_j f(h_j) \Big) e^{-i \sum_j \hat h_j h_j} \prod_k \Big\{ \sum_{S_k} e^{i S_k \sum_j J_{jk} \hat h_j} \Big\} . \tag{28} \]
Since the functions in (28) are no longer positive (in fact, not even real), the search
for a maximum of \(\Phi\) must be replaced by the saddle-point method where (after
a deformation of the path of integration in the complex plane), one looks for
values of \(h\) and \(\hat h\) for which the corresponding exponent is stationary.
In general, the field theoretic MF approach does not have an equivalent vari­
ational formulation (in fact, depending on the way the auxiliary fields are chosen,
we may get different MF formulations). Hence, it is unclear if the approximation
to Z will lead to a bound for the free energy. While there is no general answer so
far, an example given in one of the chapters of this book [22] indicates that in some
cases this may still be true.
A further important feature of the saddle-point approximation is the fact that
it can be systematically improved by expanding \(\Phi\) around the stationary value.
The inclusion of the quadratic terms may already give a dramatic improvement.
Applications of these ideas to graphical models can be found in this book [22; 2].
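A small numerical experiment (our illustration, not from the text) shows the effect: for a system small enough to enumerate exactly, ln Z from the saddle point of (20)-(21), with the quadratic (Gaussian) correction included, can be compared with the exact value. The diagonal shift c that makes J positive definite, as (18) requires, only contributes a known constant because \(S_i^2 = 1\):

import numpy as np
from itertools import product
from scipy.optimize import minimize

rng = np.random.default_rng(1)
N = 8
J = rng.normal(scale=0.3, size=(N, N))
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)
theta = rng.normal(scale=0.5, size=N)

# Shift the diagonal so the coupling matrix is positive definite, as (18) requires;
# since S_i^2 = 1 this only rescales Z by the constant exp(N c / 2).
c = 1.1 * max(0.0, -np.linalg.eigvalsh(J).min())
Jp = J + c * np.eye(N)
Jp_inv = np.linalg.inv(Jp)

def neg_phi(x):
    # -Phi(x) from eq. (21), written for the shifted couplings Jp
    return 0.5 * x @ Jp_inv @ x - np.sum(np.log(2 * np.cosh(x + theta)))

x0 = minimize(neg_phi, np.zeros(N)).x

# Laplace/saddle-point estimate of ln Z, including the Gaussian-fluctuation
# determinant from expanding Phi to second order around x0
neg_hess = Jp_inv - np.diag(1 - np.tanh(x0 + theta) ** 2)
lnZ_laplace = (-neg_phi(x0)
               - 0.5 * np.linalg.slogdet(Jp)[1]
               - 0.5 * np.linalg.slogdet(neg_hess)[1]
               - 0.5 * N * c)

# Exact ln Z by brute-force enumeration, feasible only for small N
lnZ_exact = np.log(sum(np.exp(0.5 * s @ J @ s + theta @ s)
                       for s in map(np.array, product([-1.0, 1.0], repeat=N))))
print(lnZ_laplace, lnZ_exact)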
5 When does MFT become exact?
We have seen from the Callen equation (15) that the simple MF approximation
neglects the fluctuations of the fields

\[ h_i = \sum_j J_{ij}\, S_j , \tag{29} \]

which are sums of random variables. In the interesting case where N, the total
number of variables \(S_j\), is large, one might hope that fluctuations could be small,
assuming that the \(S_j\) are weakly dependent. We will compute crude estimates of
these fluctuations for two extreme cases.
• Case I:
All couplings \(J_{ij}\) are positive and equal. In order to keep the fields \(h_i\) of order
O(1) when N grows large, we set \(J_{ij} = J_0/N\). This model is known as the mean
field ferromagnet in Statistical Physics. If we make the crude approximation that
all variables \(S_j\) are independent, the variances \(\mathrm{Var}(J_{ij} S_j) = J_0^2 (1 - \langle S_j \rangle^2)/N^2\) of
the individual terms in (29) simply add to a total variance of the fields \(\mathrm{Var}(h_i) = O(1/N)\)
for \(N \to \infty\). Hence, in this case the MF approximation becomes exact.
A more rigorous justification of this result can be obtained within the field theoretic
framework of the previous section. The necessary Gaussian transformation for this
case is simpler than (18) and reads

\[ \exp\Big[ \frac{J_0}{2N} \Big( \sum_i S_i \Big)^2 \Big] = \sqrt{\frac{N}{2\pi J_0}} \int dx \; \exp\Big[ -\frac{N x^2}{2 J_0} + x \sum_i S_i \Big] . \tag{30} \]

Inserting (30) into the partition function (4) shows that Laplace's method for
performing the single integral over x is justified for \(N \to \infty\) by the occurrence
of the factor N in the exponent.
In practical applications of MF methods, the couplings \(J_{ij}\) are usually related
to some observed data and will not be constant but may rather show a strong
variability. Hence, it is interesting to study the

• Case II:
The \(J_{ij}\)'s are assumed to be independent random variables (for i < j) with zero
mean. Setting \(\theta_i = 0\) for simplicity, we are now adding up N terms in (29) which
have roughly equal fractions of positive and negative signs. To keep the \(h_i\)'s of order
1, the magnitude of the \(J_{ij}\)'s should then scale like \(1/\sqrt{N}\). With the same arguments
as before, neglecting the dependencies of the \(S_j\)'s, we find that the variance of \(h_i\)
is now O(1) for \(N \to \infty\) and the simple MF approximation fails to become exact.

As will be shown in the next section, the failure of the "naive" mean field theory
(13) in case II can be cured by adding a suitable correction. This leads us to the
TAP mean field theory which is still a closed set of equations for the expectations
\(\langle S_i \rangle\). Under some conditions on the variance of the \(J_{ij}\)'s it is believed that these
mean field equations are exact for Case II in the limit \(N \to \infty\) with probability 1
with respect to a random drawing of the \(J_{ij}\)'s.
In fact, it should be possible to construct an exact mean field theory for any
model where the \(J_{ij}\)'s are of "infinite range". The phrase infinite range is best
understood if we assume for a moment that the spins \(S_i\) are located at sites i on a
finite dimensional lattice. If the \(J_{ij}\)'s do not decay to zero when the distance \(\|i - j\|\)
is large, we speak of an infinite range model. In such cases, the "neighbors" \(S_j\) of
\(S_i\) which contribute dominantly to the field \(h_i\) (29) of a spin \(S_i\) are not clustered
in a small neighborhood of site i but are rather distributed all over the system. In
such a case, we can expect that dependencies are weak enough to be treated well
in a mean field approximation. Especially when the connections \(J_{ij}\) between two
arbitrary spins \(S_i\) and \(S_j\) are completely random (this includes sparse as well as
extensive connectivities), the model is trivially of infinite range.
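The two scalings are easy to check numerically; the sketch below (our addition) estimates Var(h_i) under the crude independence approximation used above, for Case I and Case II couplings:

import numpy as np

rng = np.random.default_rng(2)
J0 = 1.0
for N in (100, 1000, 10000):
    J_ferro = np.full(N, J0 / N)                         # Case I:  J_ij = J0/N
    J_rand = rng.normal(scale=np.sqrt(J0 / N), size=N)   # Case II: mean 0, variance J0/N
    S = rng.choice([-1.0, 1.0], size=(500, N))           # independent spins, <S_j> = 0
    print(N, np.var(S @ J_ferro), np.var(S @ J_rand))
# Case I: Var(h_i) shrinks like 1/N; Case II: Var(h_i) stays of order 1.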
6 TAP equations I : The cavity approach
The TAP mean field equations are named after D.J. Thouless, P.W. Anderson and
R.G. Palmer [29] who derived a MF theory for the Sherrington-Kirkpatrick (SK)
model [26]. The SK model is of the type (3) where the couplings \(J_{ij}\) are independent
Gaussian random variables for i < j with variance \(J_0/N\). For simplicity, we set the
mean equal to zero. We will give two derivations in this chapter. A further derivation
and generalizations are presented in another chapter of this book [10].
Perhaps the most intuitive one is the cavity method introduced by Parisi and
Mezard [16]. It is closely related to the Bethe approximation [3] which is an exact
mean field theory on a tree.
Our goal is to derive an approximation for the marginal distribution \(P_i(S_i)\) for
each spin variable. We begin with the exact representation

\[ P_i(S_i) = \sum_{S \backslash S_i} P(S) \propto \sum_{S \backslash S_i} e^{S_i (\sum_j J_{ij} S_j + \theta_i)}\, P(S \backslash S_i) . \tag{31} \]

\(P(S \backslash S_i)\) equals the joint distribution of the N − 1 spins \(S \backslash S_i\) for an auxiliary
system, where \(S_i\) has been removed (by setting the \(J_{ij}\)'s equal to zero for all \(j \neq i\)).
If the graph of nonzero \(J_{ij}\)'s were a tree, i.e., if it contained no loops, the
\(S_j\)'s would be fully independent after being disconnected from \(S_i\). In this case, the
joint distribution \(P(S \backslash S_i)\) would factorize into a product of individual marginals
\(P_{j \backslash i}(S_j)\). From this, one would obtain immediately the marginal distribution as

\[ P_i(S_i) \propto e^{S_i \theta_i} \prod_j \sum_{S_j} e^{S_i J_{ij} S_j}\, P_{j \backslash i}(S_j) . \tag{32} \]

Within the tree assumption one could proceed further (in order to close the system
of equations) by applying the same procedure to each of the auxiliary marginals
\(P_{j \backslash i}(S_j)\) and expressing them in terms of their neighbors (excluding \(S_i\)). This
would lead us directly to the Belief Propagation (BP) algorithm [21] for recursively
computing a set of "messages" defined by

\[ m_{j \backslash i}(S_i) = \sum_{S_j} e^{S_i J_{ij} S_j}\, P_{j \backslash i}(S_j) . \tag{33} \]

This approach as well as its applications will be presented in more detail in other
chapters [4; 30; 25; 32]. The route from the BP method to the TAP equations is
presented in [13].
We will follow a different route which leads to considerable simplifications
by utilizing the fact that the SK model is fully connected. Going back to the
formulation (3), we see that the only dependence between \(S_i\) and the other variables
\(S_j\) is through the field \(h_i = \sum_j J_{ij} S_j\). Hence, it is possible to rewrite the marginal
distribution (31) in terms of the joint distribution of \(S_i\) and \(h_i\),

\[ P_i(S_i) = \int dh_i \; P(S_i, h_i) , \qquad P(S_i, h_i) \propto e^{S_i (h_i + \theta_i)}\, P(h_i \backslash S_i) , \tag{34} \]

where we have introduced the "cavity"¹ distribution of \(h_i\) as

\[ P(h_i \backslash S_i) = \sum_{S \backslash S_i} \delta\Big( h_i - \sum_j J_{ij}\, S_j \Big)\, P(S \backslash S_i) . \tag{35} \]

We get

\[ \langle S_i \rangle = \frac{ \sum_{S_i = \pm 1} S_i \int dh_i \; e^{S_i (h_i + \theta_i)}\, P(h_i \backslash S_i) }{ \sum_{S_i = \pm 1} \int dh_i \; e^{S_i (h_i + \theta_i)}\, P(h_i \backslash S_i) } . \tag{36} \]

For the SK model the independence used in (32) does not hold, but one may argue
that it can be safely replaced in the following by sufficiently weak correlations. In
the limit \(N \to \infty\), we assume that this is enough to invoke a central limit theorem
for the field \(h_i\) and replace (35) by the simple Gaussian distribution²

\[ P(h_i \backslash S_i) \approx \frac{1}{\sqrt{2\pi V_i}} \exp\Big( -\frac{(h_i - \langle h_i \rangle_i)^2}{2 V_i} \Big) \tag{37} \]

in the computation of (36). We have denoted an average over the cavity distribution
by \(\langle \cdot \rangle_i\). Using (37) within (36) we get immediately

\[ \langle S_i \rangle = \tanh\big( \theta_i + \langle h_i \rangle_i \big) , \qquad i = 1, \ldots, N , \tag{38} \]

as the first part of the TAP equations. (38) should be compared to the corresponding
set of "naive" MF equations (13) which can be written as

\[ \langle S_i \rangle = \tanh\big( \theta_i + \langle h_i \rangle \big) , \qquad i = 1, \ldots, N . \tag{39} \]
In order to close the system of equations, we have to express the cavity expectations
\langle h_i \rangle_i and the variances V_i in terms of the full expectations

\langle h_i \rangle = \sum_j J_{ij} \langle S_j \rangle .   (40)

Within the Gaussian approximation (37) we get

\langle h_i \rangle = \sum_{S_i} \int dh_i \, P(S_i, h_i) \, h_i = \langle h_i \rangle_i + V_i \langle S_i \rangle .   (41)
Hence, only the variances V_i of the cavity field remain to be computed. By definition,
they are

V_i = \sum_{j,k} J_{ij} J_{ik} \big( \langle S_j S_k \rangle_i - \langle S_j \rangle_i \langle S_k \rangle_i \big) .   (42)
Since the Jij's are modeled as independent random variables we argue that the
fluctuations of the Vi's with respect to the random sampling of the couplings can
1 The name is derived from the physical context, where h_i is the magnetic field at the cavity
which is left when spin i is removed from the system.
2 The cavity method for a model with finite connectivity is discussed in [15].
be neglected for N \to \infty, and we can safely replace V_i by

V_i = \sum_j \overline{J_{ij}^2} \, \big( 1 - \langle S_j \rangle_i^2 \big) \approx \frac{J_0^2}{N} \sum_j \big( 1 - \langle S_j \rangle^2 \big) ,   (43)
where the bar denotes an average over the distribution of the J_{ij}'s. Note that,
by the independence of the couplings, the averages over the J_{ij}'s and the terms
\langle S_j \rangle_i factorize. To get the last expression in (43), we have assumed that both the
fluctuations and the effect of removing S_i can be neglected in the sum. From equations
(38), (41) and (43) we get the TAP equations for the SK model,

\langle S_i \rangle = \tanh\Big( \theta_i + \sum_j J_{ij} \langle S_j \rangle - J_0^2 (1 - q) \langle S_i \rangle \Big) , \qquad i = 1, \ldots, N ,   (44)

where q = \frac{1}{N} \sum_j \langle S_j \rangle^2. Equations (44) differ from the simple or "naive" MF
equations (13) by the correction -J_0^2 (1 - q) \langle S_i \rangle, which is usually called the Onsager
Reaction Term. Although the simple MF approximation and the TAP approach are
based on weak correlations between random variables, the TAP approach makes this
assumption only when computing the distribution of the cavity field hi, i.e., for the
case when Si is disconnected from the system. The Onsager term is the difference
between (hi) and the cavity expectation (hi)i (compare (38) and (39)) and takes
into account the reaction of the neighbors Sj due to the correlations created by the
presence of Si.
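As a concrete illustration of how (44) is used in practice, the following Python sketch (mine, not part of the original text) iterates the TAP equations for one random SK instance with damping. The system size N, the coupling scale J_0, the random fields, the damping factor and the tolerance are all illustrative assumptions; J_0 is kept small so the iteration stays outside the spin-glass phase discussed below.

```python
import numpy as np

# Illustrative sketch: damped fixed-point iteration of the TAP equations (44)
# for one random SK instance. All numerical choices here are arbitrary.
rng = np.random.default_rng(0)
N, J0 = 200, 0.5
J = rng.normal(0.0, J0 / np.sqrt(N), size=(N, N))   # variance J0^2 / N
J = np.triu(J, 1); J = J + J.T                      # symmetric, zero diagonal
theta = rng.normal(0.0, 0.1, size=N)                # small random external fields

m = np.zeros(N)
for sweep in range(1000):
    q = np.mean(m**2)
    # naive mean field term plus the Onsager correction of eq. (44)
    m_new = np.tanh(theta + J @ m - J0**2 * (1.0 - q) * m)
    if np.max(np.abs(m_new - m)) < 1e-10:
        break
    m = 0.5 * (m + m_new)                           # damping helps convergence
print("sweeps:", sweep, " q =", np.mean(m**2))
```

For small J_0 the iteration converges to the unique solution; for larger J_0 damping alone no longer suffices, which mirrors the convergence problems discussed next.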
A full discussion about why and when (44) yields an exact mean field theory for
the SK model is subtle and goes beyond the scope of this chapter. Interested readers
are referred to [16]. We can only briefly touch on the problems. The main property in
deriving the TAP equations is the assumption of weak correlations expressed as
(45)
which can be shown to hold for the SK model when the size of the couplings Jo is
sufficiently small. In this case, there is only a single solution to (44). Things become
more complicated with increasing Jo. Analytical calculations show that one enters a
complex free energy landscape, i.e. a (spin glass) phase of the model where one has
exponentially many (in N) solutions. This corresponds to a multimodal distribution
with many equally important modes. (14) is then no longer valid for a full average but
for local averages within a single mode. Numerical solutions to the TAP equations
turn out to be extremely difficult in this region [17] and not all of them can be
accepted because they violate the positive definiteness of the covariance matrix
(SiSj) - (Si)(Sj). For a setup of the cavity approach in this complex region see
chapter V of [16] and in this volume [31] which also discusses its application to
stochastic dynamics.
Finally, we want to mention the work of M. Talagrand (see e.g. [27]) who is
developing a rigorous mathematical basis for the cavity method.
7 TAP equations II: Plefka's Expansion
Plefka's expansion [23] is a method for deriving the TAP equations by a systematic
perturbative computation of a function G(m) which is minimized by the vector of
expectations m = \langle S \rangle.
To define G(m), we go back to the minimization of the variational free energy
(12), and do not restrict the distributions Q to be product distributions. We
minimize F(Q) = E[Q] - S[Q] in two steps: In the first step, we perform a
constrained minimization in the family of all distributions Q_m which satisfy

\langle S \rangle_Q = m ,   (46)

where m is fixed. We define the Gibbs free energy as the constrained minimum

G(m) = \min_Q \big\{ E[Q] - S[Q] \;\big|\; \langle S \rangle_Q = m \big\} .   (47)

In the second step, we minimize G with respect to the vector m. Since the full
minimizer of F[Q] equals the true distribution P, the minimizer of G(m) coincides
with the vector of true expectations \langle S_i \rangle.
Constrained optimization problems like (47) can be transformed into unconstrained
ones by introducing appropriate Lagrange multipliers h_i, where we have to
minimize

E[Q] - S[Q] - \sum_i h_i \big( \langle S_i \rangle_Q - m_i \big) ,   (48)

and the h_i's must be chosen such that (46) holds. (48) is again of the form of a
variational free energy (12) where H[S] is replaced by H[S] - \sum_i h_i S_i. Hence, the
minimizing distribution is just

Q(S) = Z^{-1}(h) \, e^{-H[S] + \sum_i h_i S_i} ,   (49)

with Z(h) = \sum_S e^{-H[S] + \sum_i h_i S_i}. Inserting this solution back into (47) yields

G(m, h) = \sum_i h_i m_i - \ln \sum_S e^{-H[S] + \sum_i h_i S_i} .   (50)
The condition (46) on the h_i can finally be incorporated by a variation with respect
to the vector h,

G(m) = \max_h \Big\{ \sum_i h_i m_i - \ln \sum_S e^{-H[S] + \sum_i h_i S_i} \Big\} .   (51)

This follows by setting the gradient with respect to h equal to zero and checking the
matrix of second derivatives. The geometric meaning of the function G(m) within
Amari's Information Geometry is highlighted in the chapters [28; 1].
Why do we bother solving the more complicated two-stage optimization process,
when computing G(m) is as complicated as computing the exact free energy
F[P] = -\ln Z? It turns out that a useful perturbation expansion of G(m) with
respect to the complicated coupling term H[S] can be developed. We replace H[S]
by \lambda H[S] in (51) and expand (setting \theta_i = 0 for simplicity)

G(m) = G_0(m) + \lambda G_1(m) + \frac{\lambda^2}{2} G_2(m) + \ldots ,   (52)
with G_n = \frac{\partial^n}{\partial \lambda^n} G(m) \big|_{\lambda=0}. The computation of the G_n is a bit tricky, because one
also has to expand the Lagrange parameters h_i which maximize (51) in powers of \lambda.
However, the first two terms are simple. To zeroth order we obtain m_i = \tanh(h_i^0)
and

G_0(m) = \sum_i \left\{ \frac{1+m_i}{2} \ln\frac{1+m_i}{2} + \frac{1-m_i}{2} \ln\frac{1-m_i}{2} \right\} .   (53)
The calculation of the first order term is also simple, because the first derivative of
G at \lambda = 0 can be written as an expectation of H[S] with respect to a factorizing
distribution with mean values \langle S_i \rangle = m_i. We get

G_1(m) = - \sum_{i<j} J_{ij} m_i m_j .   (54)
A comparison of the first two terms with (12), (23) and (24) shows that we have
already recovered the simple mean field approximation. One can show that the
second order term in the expansion is

G_2(m) = -\frac{1}{2} \sum_{ij} J_{ij}^2 \, (1 - m_i^2)(1 - m_j^2) .   (55)
Minimizing (52) with respect to m for \lambda = 1, keeping only terms up to second
order, yields the TAP equations (44)³.
Plefka's method allows us to recover the TAP equations from a systematic
expansion, which in principle allows for improvements by adding higher order
terms. Corrections of this type can be found in other chapters in this book
[32; 28]. Moreover, the approximate computation of G(m) can be used to get an
approximation for the free energy -\ln Z = F[P] = \min_m G(m) as well.
For the SK model, Plefka [23] shows that all terms beyond second order in
the \lambda expansion (52) can be neglected with probability 1 (with respect to random
drawings of the J_{ij}'s) for N \to \infty, as long as we are not in the complex (spin glass)
phase of the model.
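To make the expansion concrete, here is a small numerical sanity check (an illustrative sketch of mine, not from the chapter): for a tiny system with weak couplings one can evaluate the exact G(m) of (51) by maximizing over h, which is a concave problem, and compare it with the truncation G_0 + G_1 + (1/2)G_2 from (53)-(55) at \lambda = 1. The system size, coupling scale and optimizer settings are arbitrary choices.

```python
import itertools
import numpy as np

# Illustrative sketch: exact G(m) of eq. (51) versus the truncated Plefka
# expansion G0 + G1 + (1/2)G2 at lambda = 1, theta_i = 0, weak couplings.
rng = np.random.default_rng(1)
n = 3
J = rng.normal(0.0, 0.1, size=(n, n))
J = np.triu(J, 1); J = J + J.T                            # weak symmetric couplings
S = np.array(list(itertools.product([-1, 1], repeat=n)))  # all 2^n spin states
E = -0.5 * np.einsum('ai,ij,aj->a', S, J, S)              # H[S] = -sum_{i<j} J_ij S_i S_j

def exact_G(m, steps=20000, lr=0.1):
    """Evaluate eq. (51): maximize h.m - ln Z(h), which is concave in h."""
    h = np.zeros(n)
    for _ in range(steps):
        w = np.exp(-E + S @ h); w /= w.sum()
        h += lr * (m - w @ S)                             # gradient = m - <S>_h
    return h @ m - np.log(np.sum(np.exp(-E + S @ h)))

m = np.full(n, 0.2)
G0 = np.sum((1 + m)/2 * np.log((1 + m)/2) + (1 - m)/2 * np.log((1 - m)/2))
G1 = -0.5 * m @ J @ m                                     # eq. (54)
G2h = -0.25 * np.sum(J**2 * np.outer(1 - m**2, 1 - m**2)) # (1/2) * G2 of eq. (55)
print(exact_G(m), G0 + G1 + G2h)                          # agree up to O(J^3)
```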
8 TAP equations III: Beyond the SK model
The TAP approach is special among the other mean field methods in the sense that
one has to make probabilistic assumptions on the couplings Jij in (3) in order to
derive the correct MF equations. This causes extra problems because the magnitude
of the Onsager correction term will depend on the distribution of Jij's. E.g., both
the SK model and the Hopfield model [6] belong to the same class of models (3)
but are defined by different probability distributions for the couplings Jij.
3 One also has to replace J_{ij}^2 by its average.
The weak correlations that are present between the couplings in the Hopfield
model prevent us from using the same arguments that led us to (43). In fact,
the derivation presented in chapter XIII of [16] leads to a different result. A
similar effect can be observed in the Plefka expansion (52). If the couplings are
not simple i.i.d. random variables, the expansion cannot be truncated after the
second order term. An identification of terms which survive in the limit N \to \infty is
necessary [20].
Is there a general way of deriving the correct TAP equations for the different
distributions of couplings? The chapters [13] and [18] present different approaches
to this problem. The first one is based on identifying new auxiliary variables and
couplings between them for which independence is still valid. This leads to TAP
like equations which are valid even for a sparse connectivity of couplings. However,
the explicit knowledge of the underlying distribution of couplings is required. The
second approach, motivated by earlier work of [20], develops an adaptive TAP method
which does not make explicit assumptions about the distribution. It is, however,
restricted to extensive connectivities.
9 Outlook
We have discussed different types of mean field methods in this chapter. Although
we were able to show that in certain limits these approximations become exact, we
cannot give a general answer to the question of how well they will perform on arbitrary
real data problems. The situation is perhaps simpler in statistical physics, where
there is often more detailed knowledge about the properties of a physical system
which helps to motivate a certain approximation scheme. Hence a critical reader
may argue that, especially in cases where MF approaches do not lead to a bound,
these approximations are somewhat uncontrolled and cannot be trusted. We believe
that the situation is less pessimistic. We have seen in this chapter that the MF
equations often appear as low order terms in systematic perturbation expansions.
Hence, a computation of higher order terms can be useful to check the accuracy
of the approximation and may possibly also give error bars on the predictions. We
hope that further work in this direction will provide us with approximation methods
for complex probabilistic models which are both efficient as well as reliable.
References
[1] Amari S., Ikeda S. and Shimokawa H., this book.
[2] Barber D., this book.
[3] Bethe H.A., Proc. R. Soc. London, Ser. A, 151, 552 (1935).
[4] Frey B.J. and Koetter R., this book.
[5] Ghahramani Z. and Beal M.J., this book.
[6] Hopfield J.J., Proc. Nat. Acad. Sci. USA, 79, 2554 (1982).
[7] Humphreys K. and Titterington D.M., this book.
[8] Højen-Sørensen P.A.d.F.R., Winther O. and Hansen L.K., Ensemble Learning and Linear
Response Theory for ICA, submitted to NIPS'2000 (2000).
[9] Jaakkola T., this book.
[10] Kappen H.J. and Wiegerinck W., this book.
[11] Kappen H.J. and Rodriguez F.B., Efficient Learning in Boltzmann Machines Using Linear
Response Theory, Neural Computation 10, 1137 (1998).
[12] Kabashima Y. and Saad D., Belief propagation vs. TAP for decoding corrupted messages,
Europhys. Lett. 44, 668 (1998).
[13] Kabashima Y. and Saad D., this book.
[14] Mezard M., The Space of Interactions in Neural Networks: Gardner's Computation with the
Cavity Method, J. Phys. A (Math. Gen.) 22, 2181 (1989).
[15] Mezard M. and Parisi G., Mean Field Theory of Randomly Frustrated Systems with Finite
Connectivity, Europhys. Lett. 3, 1067 (1987).
[16] Mezard M., Parisi G. and Virasoro M.A., Europhys. Lett. 1, 77 (1986) and
Spin Glass Theory and Beyond, Lecture Notes in Physics, 9, World Scientific (1987).
[17] Nemoto K. and Takayama H., J. Phys. C 18, L529 (1985).
[18] Opper M. and Winther O., this book.
[19] Parisi G., Statistical Field Theory, Addison Wesley, Reading, Massachusetts (1988).
[20] Parisi G. and Potters M., Mean-Field Equations for Spin Models with Orthogonal
Interaction Matrices, J. Phys. A (Math. Gen.) 28, 5267 (1995).
[21] Pearl J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference,
Morgan Kaufmann, San Francisco (1988).
[22] Pineda F.J., Resch C. and Wang I.J., this book.
[23] Plefka T., Convergence condition of the TAP equations for the infinite-ranged Ising spin
glass model, J. Phys. A 15, 1971 (1982).
[24] Saul L.K., Jaakkola T. and Jordan M.I., Mean Field Theory for Sigmoid Belief Networks,
J. Artificial Intelligence Research 4, 61-76 (1996).
[25] Saad D., Kabashima Y. and Vicente R., this book.
[26] Sherrington D. and Kirkpatrick S., Phys. Rev. Lett. 35, 1792 (1975).
[27] Talagrand M., Self Averaging and the Space of Interactions in Neural Networks, Random
Structures and Algorithms 14, 199 (1998), and also papers on his webpage
http://www.math.ohio-state.edu/~talagran/.
[28] Tanaka T., this book.
[29] Thouless D.J., Anderson P.W. and Palmer R.G., Solution of a 'Solvable Model of a Spin
Glass', Phil. Mag. 35, 593 (1977).
[30] Weiss Y., this book.
[31] Wong K.Y., Li S. and Luo P., this book.
[32] Yedidia J.S., this book.
[33] Zinn-Justin J., Quantum Field Theory and Critical Phenomena, Clarendon Press, Oxford
(1989).
3 An Idiosyncratic Journey Beyond
Mean Field Theory
Jonathan S. Yedidia
The connecting thread between the different methods described here
is the Gibbs free energy. After introducing the inference problem we
are interested in analyzing, I will define the Gibbs free energy, and
describe how to derive a mean field approximation to it using a variational
approach. I will then explain how one might re-derive and correct the mean
field and TAP free energies using high temperature expansions with constrained
one-node beliefs. I will explore the relationships between the high-temperature
expansion approach, the Bethe approximation, and the belief propagation
algorithm, and point out in particular the equivalence of the Bethe
approximation and belief propagation. Finally, I will describe Kikuchi
approximations to the Gibbs free energy and advertise new belief propagation
algorithms that efficiently compute beliefs equivalent to those obtained
from the Kikuchi free energy.
1 Introduction
In this chapter I will try to clarify the relationships between different ways of
deriving or correcting mean field theory. The December 1999 NIPS workshop on
"Advanced Mean Field Methods" succeeded nicely in bringing together physicists
and computer scientists, who nowadays often work on precisely the same problems,
but come to these problems with different perspectives, methods, names and
notations. Some of this chapter is therefore devoted to presenting translations
between the language of the physicist and the language of the computer scientist,
although I am sure that my original training as a physicist will show through.
I will only cover methods that I have personally used, so this chapter does
not attempt to be a thorough survey of its subject. Readers interested in more
background on the statistical physics of disordered systems (particularly with regard
to the technique of averaging over disorder using the replica method) might also
want to consult references [19], [28], and [31], while those interested in the computer
science literature on graphical models might consult references [23], [11] and [7].
2 Inference
We begin by describing the problem we will focus on. In the appealing computer
science jargon, this is the problem of "inference." We are given some complicated
probabilistic system, which we model by a pair-wise Markov network of N nodes.
We label the state of node i by x_i, and write the joint probability distribution
function as

P(x_1, x_2, \ldots, x_N) = \frac{1}{Z} \prod_{(ij)} \psi_{ij}(x_i, x_j) \prod_i \psi_i(x_i) .   (1)

Here \psi_{ij}(x_i, x_j) is the "compatibility" matrix between connected nodes i and j,
\psi_i(x_i) is called the "evidence" for node i, and Z is a normalization constant called
the "partition function" by physicists. The notation (ij) means that the product
(and, later, the sums) runs over all pairs of connected nodes.
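As an illustration (not part of the original chapter), the following Python sketch builds a tiny network of the form of equation (1) and computes exact marginals by brute-force summation over all states. The graph, state space and potentials are arbitrary illustrative choices.

```python
import itertools
import numpy as np

# Illustrative sketch: a tiny pairwise Markov network of the form of eq. (1),
# with marginals computed by explicit summation (feasible only for small N).
edges = [(0, 1), (1, 2), (0, 2)]                 # a 3-node loop, 2 states per node
rng = np.random.default_rng(0)
psi_pair = {e: rng.uniform(0.5, 2.0, size=(2, 2)) for e in edges}   # compatibilities
psi_node = [rng.uniform(0.5, 2.0, size=2) for _ in range(3)]        # evidence

def joint():
    p = np.zeros((2, 2, 2))
    for x in itertools.product(range(2), repeat=3):
        val = np.prod([psi_node[i][x[i]] for i in range(3)])
        for (i, j) in edges:
            val *= psi_pair[(i, j)][x[i], x[j]]
        p[x] = val
    return p / p.sum()                           # the normalization is Z

p = joint()
print("P_0(x_0) =", p.sum(axis=(1, 2)))         # one-node marginal of node 0
```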
Such models have many applications, in fields as diverse as computer vision,
error-correcting codes, medical diagnosis, and condensed matter physics. It may
help your intuition to think of the medical diagnosis application. In such an
application, the nodes could represent symptoms and diseases that a patient may
have, and the links \psi_{ij}(x_i, x_j) could represent the statistical dependencies between
the symptoms and diseases. Note that the links \psi_{ij}(x_i, x_j) would not normally
change from one patient to the next. On the other hand, for each patient, we would
obtain a different set of evidence \psi_i(x_i), which would correspond to our knowledge
of the symptoms for that specific patient. We would like to use the model to infer
the probability that the patient has a specific disease, that is, we want to compute
a marginal probability like P_i(x_i), which is the probability that the patient has the
disease denoted by node i.
I will just give a very rough idea of how such a model might be useful for
other applications. In a computer vision application, we might be interested in
inferring the shape of an object from the evidence provided by the pixel values
of the image. In an error-correcting code, we might be interested in inferring
(decoding) the most likely interpretation of a noisy message, where the Markov
network itself enforces the error-correcting code. In condensed matter physics, we
might want to infer (predict) the response of a magnetic system to the "evidence"
of an inhomogeneous magnetic field. For the rest of the chapter, however, I will
not make specific interpretations of the meanings of the nodes, and focus on the
mathematics of the problem.
For some networks (small ones, or networks that have the topology of a chain or
tree) we can compute any desired marginal probabilities exactly, either by explicitly
summing over all possible states of the system or by using dynamic programming
methods (we will return to the dynamic programming methods, which are also
called "belief propagation" algorithms, later in the chapter.) Otherwise, however, we
must settle for approximations. If we want to make a distinction between the exact
marginal probabilities and approximate ones (something physicists do not usually
bother doing explicitly), then we can call the approximation of the exact marginal
probability Pi(Xi) the "belief" bi(Xi), and similarly we call the approximation
of the exact two-node marginal probability Pij(Xi,Xj) the belief bij(Xi,Xj). The
mathematical problem we will focus on for the rest of this chapter is as follows: given
some arbitrary Markov network defined as in equation (1), compute as accurately
as possible any desired beliefs.
3 Some Models from Statistical Physics

In statistical mechanics, we start with Boltzmann's law for computing joint
probability functions:

P(x_1, x_2, \ldots, x_N) = \frac{1}{Z} e^{-E(x_1, x_2, \ldots, x_N)/T} ,   (2)

where E is the energy of the system and T is the temperature. We can re-write
equation (1) in this way if we define

E(x_1, x_2, \ldots, x_N) = - \sum_{(ij)} J_{ij}(x_i, x_j) - \sum_i h_i(x_i) ,   (3)

where the "bond strength" function J_{ij}(x_i, x_j) is defined by
J_{ij}(x_i, x_j) \equiv T \ln \psi_{ij}(x_i, x_j), and the "magnetic field" h_i(x_i) is defined by
h_i(x_i) \equiv T \ln \psi_i(x_i).
Before turning to approximation methods, let us pause to consider some more
general and some more specific models. Turning first to more specific models, we
can obtain the Ising model by restricting each node i to have two states S_i = \pm 1
(for the Ising case, we follow the physics convention and label the states by S_i
instead of x_i), and insisting that the compatibility matrices \psi_{ij} have the form

\psi_{ij} = \begin{pmatrix} e^{J_{ij}/T} & e^{-J_{ij}/T} \\ e^{-J_{ij}/T} & e^{J_{ij}/T} \end{pmatrix} ,

while the evidence vectors have the form \psi_i = (e^{h_i/T}, e^{-h_i/T}). In that case, we
can write the energy as

E = - \sum_{(ij)} J_{ij} S_i S_j - \sum_i h_i S_i .   (4)

If we further restrict the J_{ij} to be uniform and positive, we obtain the ferromagnetic
Ising model, while if we assume the J_{ij} are chosen from a random distribution, we
obtain an Ising spin glass. For these models, the magnetic field h_i is usually, but
not always, assumed to be uniform.
We can create more general models by introducing tensors like \psi_{ijk}(x_i, x_j, x_k)
in equation (1), or equivalently tensors like J_{ijk}(x_i, x_j, x_k) in the energy. One can of
course introduce tensors of even higher order. In the extreme limit, one can consider
a model where E(x_1, x_2, \ldots, x_N) = J_{12 \ldots N}(x_1, x_2, \ldots, x_N). If the x_i are binary and
the entries of this J tensor are chosen randomly from a Gaussian distribution, we
obtain Derrida's Random Energy Model [4].
So far, we have been implicitly assuming that the nodes in the Markov network
live on a fixed lattice and that each node can be in a discrete state x_i. In fact,
there is nothing to stop us from taking the x_i to be continuous variables, or we
can generalize to vectors r_i, where r_i can be interpreted as the position of the ith
particle in the system. Looking at it this way, we see that equation (3) can be
interpreted as an energy function for particles interacting by arbitrary two-body
forces in arbitrary one-body potentials.
4 The Gibbs Free Energy
Statistical physicists often use the following algorithm when they consider some
new model of a physical system:
1. Write down the energy function.
2. Construct an approximate Gibbs free energy.
3. Solve the stationary conditions of the approximate Gibbs free energy.
4. Write paper.
To use this algorithm successfully, one needs to understand what a Gibbs free
energy is, and how one might successfully approximate it. We will explore this
subject from numerous points of view.
The exact Gibbs free energy G_exact can be thought of as a mathematical
construction designed so that when you minimize it, you will recover Boltzmann's
law. G_exact is a function of the full joint probability function P(x_1, x_2, \ldots, x_N) and
is defined by

G_exact = U - TS ,   (5)

where U is the average (or "internal") energy,

U = \sum_{x_1, x_2, \ldots, x_N} P(x_1, x_2, \ldots, x_N) \, E(x_1, x_2, \ldots, x_N) ,   (6)

and S is the entropy,

S = - \sum_{x_1, x_2, \ldots, x_N} P(x_1, x_2, \ldots, x_N) \ln P(x_1, x_2, \ldots, x_N) .   (7)
If we minimize G_exact with respect to P(x_1, x_2, \ldots, x_N) (one needs to remember
to add a Lagrange multiplier to enforce the constraint \sum_{x_1, x_2, \ldots, x_N} P(x_1, x_2, \ldots, x_N) =
1), we do indeed recover Boltzmann's law (equation (2)) as desired. If we substitute
P = e^{-E/T}/Z into G_exact, we find that at equilibrium (that is, when the
joint probability distribution has its correct value), the Gibbs free energy is equal
to the Helmholtz free energy defined by F \equiv -T \ln Z.
One can understand things this way: the Helmholtz free energy is just a number
equal to U - TS at equilibrium, but the Gibbs free energy is a function that
gives the value of U - TS when some constraints are applied. In the case of
G_exact, we constrain the whole joint probability function P(x_1, x_2, \ldots, x_N). In other
cases that we will look at shortly, we will just constrain some of the marginal
probabilities. In general, there can be more than one "Gibbs free energy": which
one you are talking about depends on which additional constraints you want to
apply. When we minimize a Gibbs free energy with respect to those probabilities
that were constrained, we will obtain self-consistent equations that must be obeyed
in equilibrium.
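A small numerical check of these statements (my own sketch, with arbitrary energies and temperature): plugging Boltzmann's law into G_exact = U - TS reproduces F = -T ln Z, and any other joint distribution gives a larger value of G_exact.

```python
import numpy as np

# Illustrative sketch: G_exact = U - T*S is minimized by Boltzmann's law,
# and at the minimum it equals the Helmholtz free energy F = -T ln Z.
rng = np.random.default_rng(0)
T = 1.5
E = rng.normal(size=8)                  # energies of the 8 states of a toy system

def G(P):                               # U - T*S for a full joint distribution P
    return P @ E + T * np.sum(P * np.log(P))

P_eq = np.exp(-E / T); Z = P_eq.sum(); P_eq /= Z
print(G(P_eq), -T * np.log(Z))          # equal at equilibrium

P = rng.dirichlet(np.ones(8))           # a random non-equilibrium distribution
print(G(P) >= G(P_eq))                  # True: equilibrium minimizes G_exact
```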
The advantage of working with a Gibbs free energy instead of Boltzmann's Law
directly is that it is much easier to come up with ideas for approximations. There
are in fact many different approximations that one could make to a Gibbs free
energy, and much of the rest of this chapter is devoted to surveying them.
5 Mean Field Theory: The Variational Approach
One very popular way to construct an approximate Gibbs free energy involves a
variational argument. The derivation given here will be from a physicist's perspective;
for an introduction to variational methods from a different point of view, see
[12]. Assume that we have some system which can be in, say, K different states.
The probability of each state is some number P_\alpha, where \sum_{\alpha=1}^K P_\alpha = 1. Let there be
some quantity X_\alpha (like the energy) which depends on which state the system is in,
and introduce the notation for the mean value

\langle X \rangle \equiv \sum_{\alpha=1}^K P_\alpha X_\alpha .   (8)

Then by the convexity of the exponential function, we can prove that

\langle e^X \rangle \ge e^{\langle X \rangle} .   (9)
Now consider the partition function

Z = \sum_\alpha e^{-E_\alpha/T} .   (10)

Let us introduce some arbitrary "trial" energy function E_\alpha^0. We can manipulate Z
into the form

Z = \frac{\sum_\alpha e^{-(E_\alpha - E_\alpha^0)/T} \, e^{-E_\alpha^0/T}}{\sum_\alpha e^{-E_\alpha^0/T}} \sum_\alpha e^{-E_\alpha^0/T}   (11)

or

Z = \big\langle e^{-(E - E^0)/T} \big\rangle_0 \sum_\alpha e^{-E_\alpha^0/T} ,   (12)

where the notation \langle X \rangle_0 means the average of X_\alpha using the trial probability
distribution

P_\alpha^0 = \frac{e^{-E_\alpha^0/T}}{\sum_\alpha e^{-E_\alpha^0/T}} .   (13)

We can now use the inequality (9) to assert that

Z \ge e^{-\langle (E - E^0)/T \rangle_0} \sum_\alpha e^{-E_\alpha^0/T}   (14)

for any function E_\alpha^0. In terms of the Helmholtz free energy F \equiv -T \ln Z, we can
equivalently assert that

F \le \langle E - E^0 \rangle_0 - T \ln \sum_\alpha e^{-E_\alpha^0/T} ,   (15)
where we define the quantity on the right-hand side of the inequality as the
variational mean field free energy F_var corresponding to the trial probability
function P_\alpha^0. A little more manipulation gives us

F_var = \langle E \rangle_0 - T S_0 \ge F ,   (16)

where S_0 is the trial entropy defined by S_0 = -\sum_\alpha P_\alpha^0 \ln P_\alpha^0. This inequality gives
us a useful variational argument: we will look for the trial probability function P_\alpha^0
which gives us the lowest variational free energy.
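The following sketch (an illustration of mine, not from the text) verifies the bound F_var >= F of equation (16) on a small Ising chain with a factorized trial distribution; the couplings, temperature and trial parameters are random illustrative choices.

```python
import itertools
import numpy as np

# Illustrative sketch: check F_var = <E>_0 - T*S_0 >= F (eq. 16) on a
# small Ising chain, using a factorized trial distribution.
n, T = 5, 1.0
rng = np.random.default_rng(0)
J = rng.normal(size=n - 1)                       # nearest-neighbour couplings
S = np.array(list(itertools.product([-1, 1], repeat=n)))
E = -np.sum(J * S[:, :-1] * S[:, 1:], axis=1)    # chain energy, no field

F = -T * np.log(np.sum(np.exp(-E / T)))          # exact Helmholtz free energy

b = rng.uniform(0.05, 0.95, size=n)              # trial P(S_i = +1) for each spin
P0 = np.prod(np.where(S > 0, b, 1 - b), axis=1)  # factorized trial distribution
F_var = P0 @ E + T * np.sum(P0 * np.log(P0))     # <E>_0 - T*S_0
print(F, F_var, F_var >= F)                      # the bound holds for any trial b
```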
To be able to use the variational principle in practice, we must restrict ourselves
to a class of probabilities for which we can actually analytically compute F_var.
The quality of the variational approximation will depend on how well the trial
probability function can represent the true one. For continuous x_i or r_i, one
can use Gaussians as very good, yet tractable variational functions [28; 2; 3].
Richard Feynman was one of the first physicists to use this kind of variational
argument (with Gaussian trial probability functions) in his treatment of the polaron
problem [5].
The variational probability functions that are tractable for discrete x_i are not
nearly as good. When people talk about "mean field theory," they are usually
referring to using a trial probability function of the factorized form

P^0(x_1, x_2, \ldots, x_N) = \prod_i b_i(x_i) ,   (17)
and computing F_var for some energy function of a form like equation (3). The
"mean field" Gibbs free energy that results is

G_MF = - \sum_{(ij)} \sum_{x_i, x_j} J_{ij}(x_i, x_j) b_i(x_i) b_j(x_j) - \sum_i \sum_{x_i} h_i(x_i) b_i(x_i)
       + T \sum_i \sum_{x_i} b_i(x_i) \ln b_i(x_i) .   (18)
To obtain the beliefs in equilibrium according to this approximation, one
minimizes G_MF with respect to the beliefs b_i(x_i). Let us see how this works for
the Ising model with no external field. In that case, it makes sense to define the
local magnetization

m_i \equiv \sum_{S_i = \pm 1} S_i \, b_i(S_i) = b_i(+1) - b_i(-1) ,   (19)

which is a scalar that can take on values from -1 to 1. In terms of the magnetization,
we have

G_MF = - \sum_{(ij)} J_{ij} m_i m_j
       + T \sum_i \left[ \frac{1+m_i}{2} \ln\left(\frac{1+m_i}{2}\right) + \frac{1-m_i}{2} \ln\left(\frac{1-m_i}{2}\right) \right] ,   (20)

and the mean field stationary conditions are

m_i = \tanh\Big( \sum_j J_{ij} m_j / T \Big) .   (21)
If we further specialize to the case of a ferromagnet on a d-dimensional hypercubic
lattice, set all the J_{ij} = 1/(2d), and assume that all m_i are equal to the same
magnetization m, we can analytically analyze the solutions of this equation. We
find that above T_c = 1, the only solution is m = 0, while below T_c, we have two
other solutions with positive or negative magnetization. This is a classic example of
a phase transition that breaks the underlying symmetry in a model. The mean field
prediction of a phase transition is qualitatively correct for dimension d \ge 2. Other
bulk thermodynamic quantities like the susceptibility \chi \equiv \partial m / \partial h and the specific
heat C \equiv \partial U / \partial T are also easy to compute once we have the stationary conditions.
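For concreteness, here is a minimal sketch (my own, with illustrative temperatures) that iterates the self-consistency condition m = tanh(m/T), obtained from (21) with J_ij = 1/(2d) and uniform m, and exhibits the transition at T_c = 1:

```python
import numpy as np

# Illustrative sketch: fixed-point iteration of m = tanh(m/T), i.e. eq. (21)
# for the hypercubic ferromagnet with J_ij = 1/(2d) and all m_i equal
# (the d-dependence cancels). Temperatures scanned are arbitrary.
for T in [0.5, 0.9, 1.1, 2.0]:
    m = 0.7                                   # symmetry-breaking initial value
    for _ in range(10000):
        m = np.tanh(m / T)
    print(f"T = {T}: m = {m:.4f}")            # m > 0 below T_c = 1, m ~ 0 above
```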
How good an approximation does mean field theory give? It depends a lot on
the model. For the Ising ferromagnet, mean field theory becomes exact for a hypercubic
lattice in the limit of infinite dimensions, or for an "infinite-ranged" lattice
where every node is connected to every other node. On the other hand, for lower
dimensional ferromagnets, or spin glasses in any dimension, mean field theory can
give quite poor results. In general, mean field theory does badly when the nodes in
a network fluctuate a lot around their mean values, because it incorrectly insists
that all two-node beliefs b_{ij}(x_i, x_j) are simply given by b_{ij}(x_i, x_j) = b_i(x_i) b_j(x_j).
In practice, one sees many papers where questionable mean field approximations
are used when it would not have been too difficult to obtain better results using
one of the techniques that I describe in the rest of the chapter.
6 Correcting Mean Field Theory
Mean field theory is exact for the infinite-ranged ferromagnet, so when physicists
started contemplating spin glasses in the 1970's, they quickly turned to the simplest
corresponding model: the infinite-ranged Sherrington-Kirkpatrick (SK) Ising spin
glass model with zero field and J_{ij}'s chosen from a zero-mean Gaussian distribution
[25]. Thouless, Anderson and Palmer (TAP) presented "as a fait accompli" [26] a
Gibbs free energy that they claimed should be exact for this model:

-\beta G_TAP = - \sum_i \left[ \frac{1+m_i}{2} \ln\left(\frac{1+m_i}{2}\right) + \frac{1-m_i}{2} \ln\left(\frac{1-m_i}{2}\right) \right]
             + \beta \sum_{(ij)} J_{ij} m_i m_j + \frac{\beta^2}{2} \sum_{(ij)} J_{ij}^2 (1 - m_i^2)(1 - m_j^2) ,   (22)

where \beta \equiv 1/T is the inverse temperature. The only difference between the TAP
and ordinary mean field free energy is the last term, which is sometimes called the
"Onsager reaction" term.
I have written the TAP free energy in a suggestive form: it appears to be a
Taylor expansion in powers of \beta. Plefka showed that one could in fact derive G_TAP
from such a Taylor expansion [24]. Antoine Georges and I later [10] showed how to
continue the Taylor expansion to terms beyond O(\beta^2), and exploited this kind of
expansion for a variety of statistical mechanical [8; 30] and quantum mechanical [9]
models. Of course, the higher-order terms are important for any model that is
not infinite-ranged. Because this technique is little-known, but quite generally
applicable, I will review it here using the Ising spin glass energy function.
The variational approximation gives a rigorous upper bound on the Helmholtz
free energy, but there is no reason to believe that it is the best approximation one
can make for the magnetization-dependent Gibbs free energy. We can construct
such a Gibbs free energy by adding a set of external auxiliary fields (Lagrange
multipliers) that are used to insure that all the magnetizations are constrained to
their desired values. Note that the auxiliary fields are temperature-dependent. Of
course, when the magnetizations are at their equilibrium values, no auxiliary fields
will be necessary. We write

-\beta G(\beta, m_i) = \ln \Big[ \sum_{\{S_i\}} e^{-\beta E + \sum_i \lambda_i(\beta)(S_i - m_i)} \Big] ,   (23)

where the \lambda_i(\beta) are our auxiliary fields.
We can use this exact formula to expand -\beta G(\beta, m_i) around \beta = 0:

-\beta G(\beta, m_i) = \big[ -\beta G \big]_{\beta=0} + \beta \left[ \frac{\partial(-\beta G)}{\partial \beta} \right]_{\beta=0} + \frac{\beta^2}{2} \left[ \frac{\partial^2(-\beta G)}{\partial \beta^2} \right]_{\beta=0} + \cdots   (24)

At \beta = 0, the spins are entirely controlled by their auxiliary fields, and so we have
reduced our problem to one of independent spins. Since m_i is fixed equal to \langle S_i \rangle for
any inverse temperature \beta, it is in particular equal to \langle S_i \rangle when \beta = 0, which gives
us the relation

m_i = \tanh \lambda_i(0) .   (25)

From the definition of -\beta G(\beta, m_i) given in equation (23), we find that

-(\beta G)_{\beta=0} = \sum_i \ln\big( 2 \cosh \lambda_i(0) \big) - \sum_i \lambda_i(0) \, m_i .   (26)
Eliminating the \lambda_i(0), we obtain

-(\beta G)_{\beta=0} = - \sum_i \left[ \frac{1+m_i}{2} \ln\left(\frac{1+m_i}{2}\right) + \frac{1-m_i}{2} \ln\left(\frac{1-m_i}{2}\right) \right] ,   (27)
which is just the mean field entropy. Considering next the first derivative, we find
that
\beta \left[ \frac{\partial(-\beta G)}{\partial \beta} \right]_{\beta=0} = \beta \Big\langle \sum_{(ij)} J_{ij} S_i S_j \Big\rangle_{\beta=0} + \beta \sum_i \left( \frac{\partial \lambda_i}{\partial \beta} \right)_{\beta=0} \langle S_i - m_i \rangle_{\beta=0} .   (28)

At \beta = 0, the two-node correlation functions factorize (and the second term
vanishes, since \langle S_i \rangle_{\beta=0} = m_i), so we find that

\beta \left[ \frac{\partial(-\beta G)}{\partial \beta} \right]_{\beta=0} = \beta \sum_{(ij)} J_{ij} m_i m_j ,   (29)

which is, of course, the same as the variational internal energy term.
Naturally, we can continue this expansion to arbitrarily high order if we work
hard enough. Unfortunately, neither Georges and I, nor Parisi and Potters, who
later examined this expansion [22], were able to derive the Feynman rules for a
fully diagrammatic expansion, but there are some tricks that make the computation
easier [10]. To order \beta^4, we find that
-\beta G = - \sum_i \left[ \frac{1+m_i}{2} \ln\left(\frac{1+m_i}{2}\right) + \frac{1-m_i}{2} \ln\left(\frac{1-m_i}{2}\right) \right]
       + \beta \sum_{(ij)} J_{ij} m_i m_j
       + \frac{\beta^2}{2} \sum_{(ij)} J_{ij}^2 (1 - m_i^2)(1 - m_j^2)
       + \frac{2\beta^3}{3} \sum_{(ij)} J_{ij}^3 \, m_i (1 - m_i^2) \, m_j (1 - m_j^2)
       + \beta^3 \sum_{(ijk)} J_{ij} J_{jk} J_{ki} (1 - m_i^2)(1 - m_j^2)(1 - m_k^2)
       - \frac{\beta^4}{12} \sum_{(ij)} J_{ij}^4 (1 - m_i^2)(1 - m_j^2)(1 + 3 m_i^2 + 3 m_j^2 - 15 m_i^2 m_j^2)
       + 2 \beta^4 \sum_{(ijk)} J_{ij}^2 J_{jk} J_{ki} \, m_i (1 - m_i^2) \, m_j (1 - m_j^2)(1 - m_k^2)
       + \beta^4 \sum_{(ijkl)} J_{ij} J_{jk} J_{kl} J_{li} (1 - m_i^2)(1 - m_j^2)(1 - m_k^2)(1 - m_l^2)
       + \ldots   (30)
where the notation (ij), (ijk), or (ijkl) means that one should sum over all distinct
pairs, triplets, or quadruplets of spins.
For the ferromagnet on a d-dimensional hypercubic lattice, all these terms
can be reorganized according to their contribution in powers of 1/d. It is easy
to show that only the mean field terms contribute in the limit d \to \infty, and to
generate 1/d expansions for all the bulk thermodynamic quantities, including the
magnetization [10].
A few points should be made about the Taylor expansion of equation (30). First,
as with any Taylor expansion, there is a danger that the radius of convergence of the
expansion will be too small to obtain results for the value of \beta you are interested in.
It is hard to say anything about this issue in general. For ferromagnets, there does
not seem to be any problem at low or high temperatures, but for the SK model,
the issue is non-trivial and was analyzed by Plefka [24].
Secondly, since the expansion was presented as one that starts at \beta = 0, it is
initially surprising that it can work at low temperatures. The explanation, at least
for the ferromagnetic case, is that the higher-order terms become exponentially
small in the limit T \to 0. Thus, the expansion works very well for T \to 0 or T \to \infty
and is worst near T_c.
Finally, the TAP free energy is sometimes justified as a "Bethe approximation,"
that is, as an approximation that would become exact on a tree-like lattice [1].
In fact, the general convention in the statistical physics community is to refer to
the technique of using a Bethe approximation on an inhomogeneous model as the
"TAP approach." In general, to obtain the proper Bethe approximation from the
expansion (30) for models on a tree-like lattice, we need to sum over all the higher-order
terms that do not include loops of nodes. The TAP free energy for the SK
model only simplifies because for that model all terms of order \beta^3 or higher are
believed to vanish anyway in the limit N \to \infty (which is the "thermodynamic
limit" physicists are interested in). In the next section, we will describe a much
simpler way to arrive at the important Bethe approximation.
7 The Bethe Approximation
The remaining sections of this chapter will discuss the Bethe and Kikuchi
approximations and belief propagation algorithms. My understanding of these subjects
was formed by a collaboration with Bill Freeman at MERL and Yair Weiss at
Berkeley. These sections can be considered an introduction to the work that we did
together [29].
So far we have discussed Gibbs free energies with just one-node beliefs b_i(x_i)
constrained. The next obvious step to take is to constrain the two-node beliefs
b_{ij}(x_i, x_j) as well. For Markov networks that have a tree-like topology, taking this
step is sufficient to obtain the exact Gibbs free energy. The reason is that for these
models, the exact joint probability distribution itself can be factorized into a form
that only depends on one-node and two-node marginal probabilities:

P(x_1, x_2, \ldots, x_N) = \prod_{(ij)} P_{ij}(x_i, x_j) \prod_i \big[ P_i(x_i) \big]^{1 - q_i} ,   (31)

where q_i is the number of nodes that are connected to node i.
Recall that the exact Gibbs free energy is G = U - TS, where the internal
energy is U = \sum_\alpha P_\alpha E_\alpha, the entropy is S = -\sum_\alpha P_\alpha \ln P_\alpha, and \alpha is an index over every
possible state. Using equation (31), we find that the exact entropy for models with
tree-like topology is

S = - \sum_{(ij)} \sum_{x_i, x_j} P_{ij}(x_i, x_j) \ln P_{ij}(x_i, x_j) + \sum_i (q_i - 1) \sum_{x_i} P_i(x_i) \ln P_i(x_i) .   (32)

The average energy can be expressed exactly in terms of one-node and two-node
marginal probabilities for pair-wise Markov networks of any topology:

U = - \sum_{(ij)} \sum_{x_i, x_j} P_{ij}(x_i, x_j) \big( J_{ij}(x_i, x_j) + h_i(x_i) + h_j(x_j) \big)
    + \sum_i (q_i - 1) \sum_{x_i} P_i(x_i) h_i(x_i) .   (33)

The first term is just the average energy of each link, and the second term is a
correction for the fact that the evidence at each node is counted q_i - 1 times too
many.
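A quick numerical check of (31) and (32) (an illustrative sketch of mine, not from the text) on a three-node chain, which is a tree with q_0 = q_2 = 1 and q_1 = 2, using arbitrary random potentials:

```python
import itertools
import numpy as np

# Illustrative sketch: verify the tree factorization (31) and the entropy
# formula (32) on a small chain 0 - 1 - 2.
rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2)]
psi = {e: rng.uniform(0.5, 2.0, size=(2, 2)) for e in edges}

P = np.ones((2, 2, 2))
for x in itertools.product(range(2), repeat=3):
    P[x] = psi[(0, 1)][x[0], x[1]] * psi[(1, 2)][x[1], x[2]]
P /= P.sum()

P01 = P.sum(axis=2); P12 = P.sum(axis=0)               # two-node marginals
P1 = P.sum(axis=(0, 2))                                # marginal of the middle node

# eq. (31): joint = product of pair marginals times [P_1]^(1 - q_1) = 1/P_1
Q = (P01[:, :, None] * P12[None, :, :]) / P1[None, :, None]
print(np.allclose(P, Q))                               # True on a tree

S_exact = -np.sum(P * np.log(P))
S_tree = (-np.sum(P01 * np.log(P01)) - np.sum(P12 * np.log(P12))
          + np.sum(P1 * np.log(P1)))                   # only (q_1 - 1) = 1 survives
print(np.isclose(S_exact, S_tree))                     # eq. (32) holds
```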
The Bethe approximation to the Gibbs free energy amounts to using these
expressions (with beliefs substituting for exact marginal probabilities) for any pair-wise
Markov network:

G_Bethe = \sum_{(ij)} \sum_{x_i, x_j} b_{ij}(x_i, x_j) \big( T \ln b_{ij}(x_i, x_j) + E_{ij}(x_i, x_j) \big)
          - \sum_i (q_i - 1) \sum_{x_i} b_i(x_i) \big( T \ln b_i(x_i) + E_i(x_i) \big) ,   (34)

where we have introduced the local energies E_i(x_i) \equiv -h_i(x_i) and E_{ij}(x_i, x_j) \equiv
-J_{ij}(x_i, x_j) - h_i(x_i) - h_j(x_j). Of course, the beliefs b_{ij}(x_i, x_j) and b_i(x_i) must obey
the standard normalization conditions \sum_{x_i} b_i(x_i) = 1 and \sum_{x_i, x_j} b_{ij}(x_i, x_j) = 1, and the
marginalization conditions b_i(x_i) = \sum_{x_j} b_{ij}(x_i, x_j).
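To make (34) concrete, the following sketch (my own illustration, with arbitrary potentials and temperature) evaluates the Bethe free energy for a given set of beliefs. On the single-link toy example, where the exact marginals are used as beliefs, G_Bethe coincides with the exact Helmholtz free energy, as it must on a tree.

```python
import numpy as np

# Illustrative sketch: evaluate the Bethe free energy (34) for given beliefs.
def bethe_free_energy(T, edges, q, b1, b2, E1, E2):
    """b1[i], b2[(i,j)]: one- and two-node beliefs; E1, E2: local energies."""
    G = 0.0
    for e in edges:
        G += np.sum(b2[e] * (T * np.log(b2[e]) + E2[e]))
    for i, bi in enumerate(b1):
        G -= (q[i] - 1) * np.sum(bi * (T * np.log(bi) + E1[i]))
    return G

# Tiny example: a single link (a tree with q = [1, 1]).
T = 1.0
edges = [(0, 1)]
E2 = {(0, 1): np.array([[-1.0, 1.0], [1.0, -1.0]])}   # E_ij = -J_ij S_i S_j, J = 1
E1 = [np.zeros(2), np.zeros(2)]                        # no fields
P = np.exp(-E2[(0, 1)] / T); P /= P.sum()              # exact joint = pair belief
b1 = [P.sum(axis=1), P.sum(axis=0)]
print(bethe_free_energy(T, edges, [1, 1], b1, {(0, 1): P}, E1, E2))
print(-T * np.log(np.sum(np.exp(-E2[(0, 1)] / T))))    # equals exact F on a tree
```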
There is more than one way to obtain the stationarity conditions for the
Bethe free energy. For inhomogeneous models, the most straightforward approach
is to form a Lagrangian L by adding Lagrange multipliers which enforce the
normalization and marginalization conditions, and to differentiate the Lagrangian
with respect to the beliefs and those Lagrange multipliers. We have

L = G_Bethe + \sum_{(ij)} \sum_{x_j} \lambda_{ij}(x_j) \Big( b_j(x_j) - \sum_{x_i} b_{ij}(x_i, x_j) \Big)
    + \sum_{(ij)} \sum_{x_i} \lambda_{ji}(x_i) \Big( b_i(x_i) - \sum_{x_j} b_{ij}(x_i, x_j) \Big)
    + \sum_i \gamma_i \Big( 1 - \sum_{x_i} b_i(x_i) \Big) + \sum_{(ij)} \gamma_{ij} \Big( 1 - \sum_{x_i, x_j} b_{ij}(x_i, x_j) \Big) .   (35)

Of course, the derivatives with respect to the Lagrange multipliers give back
the desired constraints, while the derivatives with respect to the beliefs give back
equations for beliefs in terms of Lagrange multipliers:

b_i(x_i) = \frac{1}{Z_i} \exp\Big[ \frac{1}{T} \Big( -E_i(x_i) + \frac{1}{q_i - 1} \sum_{j \in N(i)} \lambda_{ji}(x_i) \Big) \Big]   (36)

and

b_{ij}(x_i, x_j) = \frac{1}{Z_{ij}} \exp\Big[ \frac{1}{T} \big( -E_{ij}(x_i, x_j) + \lambda_{ij}(x_j) + \lambda_{ji}(x_i) \big) \Big] ,   (37)

where Z_i and Z_{ij} are constants which enforce the normalization conditions. Finally
one can use the marginalization conditions to obtain self-consistent equations for
the Lagrange multipliers.
The Bethe approximation is a significantly better approximation to the Gibbs
free energy than the mean field approximation. The only real difficulty is a practical
one: how do we minimize the Bethe free energy efficiently? As we shall see, it turns
out that the belief propagation algorithm, which was developed by Pearl following
an entirely different path, provides a possible answer.
8 Belief Propagation
Belief propagation algorithms can probably best be understood by imagining
that each node in a Markov network represents a person, who communicates by
"messages" with those people on connected nodes about what their beliefs should
be. Let us see what the properties of these messages should be if we want to get
reasonable equations for the beliefs b_i(x_i). We will denote the message from node j
to node i by M_{ji}(x_i). Note that the message has the same dimensionality as node i:
the person at j is telling the one at i something like "you should believe in your state
1 twice as strongly as your state 2, and your state number 3 should be impossible."
That message would be the vector (2, 1, 0). Now imagine that the person at node
i is looking at all the messages that he is getting, plus the independent evidence
that he alone is receiving, denoted by \psi_i(x_i). Assume that each message is arriving
independently and is reliably informing the person at node i about something he
has no other way of finding out. Given equally reliable messages and evidence, what
should his beliefs be? A reasonable guess would be

b_i(x_i) = \alpha \, \psi_i(x_i) \prod_{j \in N(i)} M_{ji}(x_i) ,   (38)

where \alpha is a normalization constant, and N(i) denotes all the nodes neighboring i.
Thus a person following this rule who got messages (2, 1, 0) and (1, 1, 1) and had
personal evidence (1, 2, 1) would have a belief (.5, .5, 0). His thought process would
work like this: "The first message is telling me that state 3 is impossible, the second
message can be ignored because it is telling me it does not care, while my personal
evidence is telling me to believe in state 2 twice as strongly as state 1, which is the
opposite of what the first message tells me, so I will just believe in state 1 and state
2 equally strongly."
Now consider the joint beliefs of a pair of neighboring nodes i and j. Clearly
they must depend on the compatibility matrix \psi_{ij}(x_i, x_j), the evidence at each node
\psi_i(x_i) and \psi_j(x_j), and all the messages coming into nodes i and j. The obvious
guess would be the rule

b_{ij}(x_i, x_j) = \alpha \, \psi_{ij}(x_i, x_j) \, \psi_i(x_i) \, \psi_j(x_j) \prod_{k \in N(i) \setminus j} M_{ki}(x_i) \prod_{l \in N(j) \setminus i} M_{lj}(x_j) .   (39)

If we combine these rules for the one-node and two-node beliefs with the marginalization
condition

b_i(x_i) = \sum_{x_j} b_{ij}(x_i, x_j) ,   (40)

we obtain the self-consistent equations for the messages

M_{ij}(x_j) = \alpha \sum_{x_i} \psi_{ij}(x_i, x_j) \, \psi_i(x_i) \prod_{k \in N(i) \setminus j} M_{ki}(x_i) ,   (41)

where N(i) \setminus j means all nodes neighboring i except for j. The belief propagation
algorithm amounts to solving these message equations iteratively, and using the
solution for the messages in the belief equations.
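A minimal implementation of this algorithm (an illustrative sketch, not the authors' code) iterates the message update (41) on a small graph with a loop and then forms the beliefs with (38). The graph, potentials and iteration count are arbitrary choices; the message normalization plays the role of the constant alpha.

```python
import numpy as np

# Illustrative sketch: loopy belief propagation with updates (41), beliefs (38).
rng = np.random.default_rng(0)
n, k = 4, 2                                        # 4 nodes, 2 states each
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]           # a single loop
psi2 = {e: rng.uniform(0.5, 2.0, (k, k)) for e in edges}
psi1 = [rng.uniform(0.5, 2.0, k) for _ in range(n)]
nbrs = {i: [j for e in edges for j in e if i in e and j != i] for i in range(n)}

def pair_psi(i, j, xi, xj):                        # psi_ij for either orientation
    return psi2[(i, j)][xi, xj] if (i, j) in psi2 else psi2[(j, i)][xj, xi]

M = {(i, j): np.ones(k) for i in range(n) for j in nbrs[i]}   # messages M_ij
for _ in range(100):
    for (i, j) in list(M):
        # eq. (41): sum over x_i of psi_ij psi_i times incoming messages except j's
        prod = psi1[i] * np.prod([M[(l, i)] for l in nbrs[i] if l != j], axis=0)
        new = np.array([sum(pair_psi(i, j, xi, xj) * prod[xi] for xi in range(k))
                        for xj in range(k)])
        M[(i, j)] = new / new.sum()                # normalization acts as alpha

for i in range(n):                                 # eq. (38): one-node beliefs
    b = psi1[i] * np.prod([M[(j, i)] for j in nbrs[i]], axis=0)
    print(f"b_{i} =", b / b.sum())
```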
So far I have probably just convinced you that the belief propagation algorithm
is vaguely plausible. Pearl did more than that, of course: he showed directly that
all the belief propagation equations written above are exact for Markov networks
that have a tree-like topology [23]. One might note that this fact was already
partially known in the physics literature: as long ago as 1979, T. Morita wrote
down the correct belief propagation equations for the case of an Ising spin glass in
a random field [20]. Of course, the suitability of these equations as an algorithm
was not appreciated. Recently, Y. Kabashima and D. Saad [13; 14] have shown
that for a number of other specific disordered models, the TAP approach and belief
propagation give rise to identical equations, and speculated that this might be true
in general.
Freeman, Weiss and I have shown that this identity does in fact hold in general
[29]. To prove it for general Markov networks, you simply need to identify the
following relationship between the Lagrange multipliers \lambda_{ij}(x_j) that we introduced
in the last section and the messages M_{ij}(x_j):

\lambda_{ij}(x_j) = T \ln \prod_{k \in N(j) \setminus i} M_{kj}(x_j) .   (42)

Using this relation, one can easily show that equations (36) and (37) derived for
the Bethe approximation in the last section are equivalent to the belief propagation
equations (38) and (39).
9 Kikuchi Approximations and Generalized Belief
Propagation
Pearl pointed out that belief propagation was not exact for networks with loops, but
that has not stopped a number of researchers from using it on such networks, often
very successfully. One particularly dramatic case is the near Shannon-limit performance
of "Turbo codes" and low density parity check codes, whose decoding algorithm is
equivalent to belief propagation on a network with loops [18; 17]. For some problems
in computer vision involving networks with loops, belief propagation has worked
well and converged very quickly [7; 6; 21]. On the other hand, for other networks
with loops, belief propagation gives poor results or fails to converge [21; 29].
What has been generally missing has been an idea for how one might systematically
correct belief propagation in a way that preserves its main advantage: the
rapidity with which it normally converges [27]. The idea which turned out to be
successful was to work out approximations to the Gibbs free energy that are even
more accurate than the Bethe approximation, and find corresponding "generalized"
belief propagation algorithms.
Once one has the idea of improving the approximation for the Gibbs free
energy by constraining two-node beliefs like b_{ij}(x_i, x_j), it is natural to go further
and constrain higher-order beliefs as well. The "cluster variation method," which
was invented by Kikuchi [15; 16], is a way of obtaining increasingly accurate
approximations in precisely this way. The idea is to group the nodes of the
Markov network into basic (possibly overlapping) clusters, and then to compute
an approximation to the Gibbs free energy by summing the free energies of the
basic clusters, minus the free energy of over-counted intersections of clusters, minus
the free energy of over-counted intersections of intersections, and so on. The Bethe
approximation is the simplest example of one of these more complicated Kikuchi
free energies: for that case, the basic clusters are all the connected pairs of nodes.
Every Kikuchi free energy will handle the average energy exactly, and the entropy
will become increasingly accurate as the size of the basic clusters increases.
Rather than repeat analysis that you can find elsewhere, I will just advertise the
results of our work [29]. One can indeed derive new belief propagation algorithms
based on Kikuchi free energies. They converge to beliefs that are provably equivalent
to the beliefs that are obtained from the Kikuchi stationary conditions. The new
messages that need to be introduced involve groups of nodes telling other groups of
nodes what their joint beliefs should be. These new belief propagation algorithms
have the attractive feature of being user-adjustable: by paying some additional
computational cost, you can buy additional accuracy. In practice, the additional
cost is not great: we found that we were able to obtain dramatic improvements
in accuracy at negligible cost for some models where ordinary belief propagation
performs poorly.
Acknowledgments
It is a pleasure to thank my collaborators Jean-Philippe Bouchaud, Bill Freeman,
Antoine Georges, Marc Mezard, and Yair Weiss with whom I have enjoyed exploring
the issues described in this chapter.
References
[1] Bethe H.A., Proc. Royal Soc. of London A, 150, 552, 1935.
[2] Bouchaud J.P., Mezard M., Parisi G. and Yedidia J.S., J. Phys. A, 24, L1025, 1991.
[3] Bouchaud J.P., Mezard M. and Yedidia J.S., Phys. Rev. B 46, 14686, 1992.
[4] Derrida B., Phys. Rev. B 24, 2613, 1981.
[5] Feynman R.P., Phys. Rev. 97, 660, 1955.
[6] Freeman W.T. and Pasztor E., 7th International Conference on Computer Vision, 1182, 1999.
[7] Frey B.J., Graphical Models for Machine Learning and Digital Communication, Cambridge:
MIT Press, 1998.
[8] Georges A., Mezard M. and Yedidia J.S., Phys. Rev. Lett. 64, 2937, 1990.
[9] Georges A. and Yedidia J.S., Phys. Rev. B 43, 3475, 1991.
[10] Georges A. and Yedidia J.S., J. Phys. A 24, 2173, 1991.
[11] Jordan M.I., ed., Learning in Graphical Models, Cambridge: MIT Press, 1998.
[12] Jordan M.I., Ghahramani Z., Jaakkola T. and Saul L.K., in Learning in Graphical Models,
M.I. Jordan ed., Cambridge: MIT Press, 1998.
[13] Kabashima Y. and Saad D., Europhys. Lett. 44, 668, 1998.
[14] Kabashima Y. and Saad D., contribution to this volume, 2000.
[15] Kikuchi R., Phys. Rev. 81, 988, 1951.
[16] Kikuchi R., Special issue in honor of R. Kikuchi, Prog. Theor. Phys. Suppl., 115, 1994.
[17] MacKay D.J.C., IEEE Trans. on Inf. Theory, 1999.
[18] McEliece R., MacKay D.J.C. and Cheng J., IEEE J. on Sel. Areas in Comm. 16 (2), 140, 1998.
[19] Mezard M., Parisi G. and Virasoro M.A., Spin Glass Theory and Beyond, Singapore: World
Scientific, 1987.
[20] Morita T., Physica 98A, 566, 1979.
[21] Murphy K., Weiss Y. and Jordan M., in Proc. Uncertainty in AI, 1999.
[22] Parisi G. and Potters M., J. Phys. A 28, 5267, 1995.
[23] Pearl J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, San
Francisco: Morgan Kaufmann, 1988.
[24] Plefka T., J. Phys. A 15, 1971, 1982.
[25] Sherrington D. and Kirkpatrick S., Phys. Rev. Lett. 35, 1792, 1975.
[26] Thouless D.J., Anderson P.W. and Palmer R.G., Phil. Mag. 35, 593, 1977.
[27] Weiss Y., Bayesian Belief Propagation for Image Understanding, available at Yair Weiss's
homepage, 1999.
[28] Yedidia J.S., 1992 Lectures in Complex Systems, L. Nadel and D. Stein, eds.,
Addison-Wesley, 299, 1993.
[29] Yedidia J.S., Freeman W.T. and Weiss Y., MERL TR2000-26, available at
http://www.merl.com/reports/TR2000-26/, 2000.
[30] Yedidia J.S. and Georges A., J. Phys. A 23, 2165, 1990.
[31] Young A.P., ed., Spin Glasses and Random Fields, Singapore: World Scientific, 1998.
4 Mean Field Theory for Graphical
Models
Hilbert J. Kappen and Wim J. Wiegerinck
In this chapter, mean field theory is introduced from an information
theoretic view point. The mean field approximation is defined as
the factorized distribution that is closest to the target distribution.
When using the KL divergence to define closeness, this factorized
distribution must have equal marginals as the target distribution.
Such marginals can be approximately computed by using a Taylor
series expansion in the couplings around the factorized distribution.
To lowest order in the couplings, the usual naive mean field equations
are obtained and to second order, one obtains the TAP equations. An
important advantage of this procedure is that it does not require the
concept of a free energy. Therefore, it can be applied to arbitrary
probability distributions, such as arising in asymmetric stochastic
neural networks and graphical models.
1 Introduction
During the last few years, the use of probabilistic methods in artificial intelligence
and machine learning has gained enormous popularity. In particular, probabilistic
graphical models have become the preferred method for knowledge representation
and reasoning [4]. The advantage of the probabilistic approach is that all
assumptions are made explicit in the modeling process and that consequences, such
as predictions on novel data, are assumption free and follow from a mechanistic
computation. The drawback of the probabilistic approach is that the method is
intractable. This means that the typical computation scales exponentially with the
problem size.
Recently, a number of authors have proposed methods for approximate inference
in large graphical models. The simplest approach gives a lower bound on the
probability of a subset of variables using Jensen's inequality [14]. The method
involves the minimization of the KL divergence between the target probability
distribution p and some 'simple' variational distribution q. The method can be
applied to any probability model, whether directed or undirected.
The Boltzmann-Gibbs distribution is widely used in physics, and mean field
theory has been known for these distributions for a long time. For instance, for the
Ising model on a square lattice, it is known as the Bragg-Williams approximation [3],
and it is generalized to other models in the Landau theory [10]. One can show that
the above lower bound corresponds to the first term in a Taylor series expansion of
the free energy around a factorized model. This Taylor series can be continued,
and the second order term is known as the Thouless Anderson Palmer (TAP)
correction [16; 13; 6; 7]. The second order term significantly improves the quality
of the approximation, depending on the amount of frustration in the system, but
is no longer a bound.
For probability distributions that are not Boltzmann-Gibbs distributions, it is
not obvious how to obtain the second order approximation. However, there is an
alternative way to compute the higher order corrections, based on an information
theoretic argument. The general approach to this mean field approximation is
introduced in section 2. Before we work out the mean field approximations for the
general case, we first illustrate this idea for Boltzmann distributions in section 3.
Subsequently, in section 4 we consider the general case. Finally, in section 5 we
illustrate the approach for sigmoid belief networks.
2 Mean field theory
In this section we consider a form of mean field theory that was previously proposed
by Plefka [13] for Boltzmann-Gibbs distributions. It turns out, however, that the
restriction to Boltzmann-Gibbs distributions is not necessary and one can derive
results that are valid for arbitrary probability distributions. We therefore consider
the general case.
Our argument uses an information geometric viewpoint. For an introduction
to this approach see for instance [1]. Let x = (x_1, \ldots, x_n) be an n-dimensional
vector, with x_i taking on discrete values. Let p(x|\theta) be a probability distribution
on x, parametrized by \theta. Let \mathcal{P} = \{p(x|\theta)\} be the manifold of all the probability
distributions that can be obtained by considering different values of \theta.
We now assume that \mathcal{P} contains a submanifold of factorized probability
distributions in the following sense. We assume that the parametrization \theta is such
that it can be divided into two subsets, \theta = (\theta, w), and that the submanifold
\mathcal{M} \subset \mathcal{P} of factorized probability distributions is described by w = 0. \theta parametrizes
the factorized distributions in the manifold \mathcal{M}, and w parametrizes the remainder
of the manifold \mathcal{P}. We will denote factorized distributions by q(x|\theta) = p(x|\theta, w = 0).
Consider an arbitrary probability distribution p(x|\theta, w) \in \mathcal{P}. We define its mean
field approximation as the factorized distribution q(x|\theta^q) \in \mathcal{M} that is closest to
p(x|\theta, w). As a distance measure, we use the Kullback-Leibler divergence [1; 17]¹

KL = \sum_x p(x|\theta, w) \log \frac{p(x|\theta, w)}{q(x|\theta^q)} .   (1)

Since q(x|\theta^q) is a factorized distribution, q(x|\theta^q) = \prod_{i=1}^n q_i(x_i|\theta_i^q), we can find
the closest q by differentiating the Kullback-Leibler divergence with respect to
these independent components q_i(x_i|\theta_i^q). Using a Lagrange multiplier to ensure
normalization of q_i(x_i|\theta_i^q), one finds that this optimal q must satisfy

q_i(x_i|\theta_i^q) = p(x_i|\theta, w) ,   (2)

where p(x_i|\theta, w) is the marginal distribution of p(x|\theta, w) on variable x_i.
1 Note that to obtain the standard variational bound using Jensen's inequality, one employs
'the other' KL divergence, with the roles of p and q reversed. As will be outlined below, the KL
divergence considered here gives the same result as Jensen's bound to lowest order.
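As a small numerical illustration of equation (2) (my own sketch, using a random toy joint distribution), the factorized distribution minimizing this KL divergence is exactly the product of the marginals of p:

```python
import numpy as np

# Illustrative sketch: the factorized q minimizing KL(p || q) is the
# product of the marginals of p, as stated in eq. (2).
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(8)).reshape(2, 2, 2)       # joint over x1, x2, x3

marg = [p.sum(axis=tuple(a for a in range(3) if a != i)) for i in range(3)]
q_opt = marg[0][:, None, None] * marg[1][None, :, None] * marg[2][None, None, :]

def kl(p, q):
    return np.sum(p * np.log(p / q))

print("KL to product of marginals:", kl(p, q_opt))
# Any other factorized q does worse:
b = [rng.dirichlet(np.ones(2)) for _ in range(3)]
q = b[0][:, None, None] * b[1][None, :, None] * b[2][None, None, :]
print(kl(p, q) >= kl(p, q_opt))                      # True
```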
forms, if you please!"
There was something of scorn and bitterness in the laugh which
accompanied these words, and Joseph looked at him with a puzzled
air.
"You may as well know now," Philip whispered, "that when I was a
spoony youth of twenty, I very nearly imagined myself in love with
Miss Clementina Blessing, and she encouraged my greenness until it
spread as fast as a bamboo or a gourd-vine. Of course, I've long
since congratulated myself that she cut me up, root and branch,
when our family fortune was lost. The awkwardness of our
intercourse is all on her side. Can she still have faith in her charms
and my youth, I wonder? Ye gods! that would be a lovely conclusion
of the comedy!"
Joseph could only join in the laugh as they parted. There was no
time to reflect upon what had been said. Clementina, nevertheless,
assumed a new interest in his eyes; and as he drove her towards the
farm, he could not avoid connecting her with Philip in his thoughts.
She, too, was evidently preoccupied with the meeting, for Philip's
name soon floated to the surface of their conversation.
"I expect a visit from him soon," said Joseph. As she was silent,
he ventured to add: "You have no objections to meeting with him, I
suppose?"
"Mr. Held is still a gentleman, I believe," Clementina replied, and
then changed the subject of conversation.
Julia flew at her sister with open arms, and showered on her a
profusion of kisses, all of which were received with perfect serenity,
Clementina merely saying, as soon as she could get breath: "Dear
me, Julia, I scarcely recognize you! You are already so countrified!"
Rachel Miller, although a woman, and notwithstanding her recent
experience, found herself greatly bewildered by this new apparition.
Clementina's slow, deliberate movements and her even-toned,
musical utterance impressed her with a certain respect; yet the
qualities of character they suggested never manifested themselves.
On the contrary, the same words, in any other mouth, would have
often expressed malice or heartlessness. Sometimes she heard her
own homely phrases repeated, as if by the most unconscious,
purposeless imitation, and had Julia either smiled or appeared
annoyed, her suspicions might have been excited; as it was, she was
constantly and sorely puzzled.
Once only, and for a moment, the two masks were slightly lifted.
At dinner, Clementina, who had turned the conversation upon the
subject of birthdays, suddenly said to Joseph: "By the way, Mr.
Asten, has Julia told you her age?"
Julia gave a little start, but presently looked up, with an
expression meant to be artless.
"I knew it before we were married," Joseph quietly answered.
Clementina bit her lip. Julia, concealing her surprise, flashed a
triumphant glance at her sister, then a tender one at Joseph, and
said: "We will both let the old birthdays go; we will only have one
and the same anniversary from this time on!"
Joseph felt, through some natural magnetism of his nature rather
than from any perceptible evidence, that Clementina was sharply
and curiously watching the relation between himself and his wife. He
had no fear of her detecting misgivings which were not yet
acknowledged to himself, but was instinctively on his guard in her
presence.
It was not many days before Philip called. Julia received him
cordially, as the friend of her husband, while Clementina bowed with
an impassive face, without rising from her seat. Philip, however,
crossed the room and gave her his hand, saying cheerily: "We used
to be old friends, Miss Blessing. You have not forgotten me?"
"We cannot forget when we have been asked to do so," she
warbled.
Philip took a chair. "Eight years!" he said: "I am the only one who
has changed in that time."
Julia looked at her sister, but the latter was apparently absorbed
in comparing some zephyr tints.
"The whirligig of time!" he exclaimed: "who can foresee anything?
Then I was an ignorant, petted young aristocrat,—an expectant heir;
now behold me, working among miners and puddlers and forgemen!
It's a rough but wholesome change. Would you believe it, Mrs.
Asten, I've forgotten the mazurka!"
"I wish to forget it," Julia replied: "the spring-house is as
important to me as the furnace to you."
"Have you seen the Hopetons lately?" Clementina asked.
Joseph saw a shade pass over Philip's face, and he seemed to
hesitate a moment before answering: "I hear they will be neighbors
of mine next summer. Mr. Hopeton is interested in the new branch
down the valley, and has purchased the old Calvert property for a
country residence."
"Indeed? Then you will often see them."
"I hope so: they are very agreeable people. But I shall also have
my own little household: my sister will probably join me."
"Not Madeline!" exclaimed Julia.
"Madeline," Philip answered. "It has long been her wish, as well as
mine. You know the little cottage on the knoll, at Coventry, Joseph! I
have taken it for a year."
"There will be quite a city society," murmured Clementina, in her
sweetest tones. "You will need no commiseration, Julia. Unless,
indeed, the country people succeed in changing you all into their
own likeness. Mrs. Hopeton will certainly create a sensation. I am
told that she is very extravagant, Mr. Held?"
"I have never seen her husband's bank account," said Philip, dryly.
He rose presently, and Joseph accompanied him to the lane.
Philip, with the bridle-rein over his arm, delayed to mount his horse,
while the mechanical commonplaces of speech, which, somehow,
always absurdly come to the lips when graver interests have
possession of the heart, were exchanged by the two. Joseph felt,
rather than saw, that Philip was troubled. Presently the latter said:
"Something is coming over both of us,—not between us. I thought I
should tell you a little more, but perhaps it is too soon. If I guess
rightly, neither of us is ready. Only this, Joseph, let us each think of
the other as a help and a support!"
"I do, Philip!" Joseph answered. "I see there is some influence at
work which I do not understand, but I am not impatient to know
what it is. As for myself, I seem to know nothing at all; but you can
judge,—you see all there is."
Even as he pronounced these words Joseph felt that they were
not strictly sincere, and almost expected to find an expression of
reproof in Philip's eyes. But no: they softened until he only saw a
pitying tenderness. Then he knew that the doubts which he had
resisted with all the force of his nature were clearly revealed to
Philip's mind.
They shook hands, and parted in silence; and Joseph, as he
looked up to the gray blank of heaven, asked himself: "Is this all?
Has my life already taken the permanent imprint of its future?"
CHAPTER XIV.
THE AMARANTH.
Clementina returned to the city without having made any very
satisfactory discovery. Her parting was therefore conventionally
tender: she even thanked Joseph for his hospitality, and endeavored
to throw a little natural emphasis into her words as she expressed
the hope of being allowed to renew her visit in the summer.
During her stay it seemed to Joseph that the early harmony of his
household had been restored. Julia's manner had been so gentle and
amiable, that, on looking back, he was inclined to believe that the
loneliness of her new life was alone responsible for any change. But
after Clementina's departure his doubts were reawakened in a more
threatening form. He could not guess, as yet, the terrible chafing of
a smiling mask; of a restraint which must not only conceal itself, but
counterfeit its opposite; of the assumption by a narrow, cold, and
selfish nature of virtues which it secretly despises. He could not have
foreseen that the gentleness, which had nearly revived his faith in
her, would so suddenly disappear. But it was gone, like a glimpse of
the sun through the winter fog. The hard, watchful expression came
back to Julia's face; the lowered eyelids no longer gave a fictitious
depth to her shallow, tawny pupils; the soft roundness of her voice
took on a frequent harshness, and the desire of asserting her own
will in all things betrayed itself through her affected habits of
yielding and seeking counsel.
She continued her plan of making herself acquainted with all the
details of the farm business. When the roads began to improve, in
the early spring, she insisted on driving to the village alone, and
Joseph soon found that she made good use of these journeys in
extending her knowledge of the social and pecuniary standing of all
the neighboring families. She talked with farmers, mechanics, and
drovers; became familiar with the fluctuations in the prices of grain
and cattle; learned to a penny the wages paid for every form of
service; and thus felt, from week to week, the ground growing more
secure under her feet.
Joseph was not surprised to see that his aunt's participation in the
direction of the household gradually diminished. Indeed, he scarcely
noticed the circumstance at all, but he was at last forced to remark
her increasing silence and the trouble of her face. To all appearance
the domestic harmony was perfect, and if Rachel Miller felt some
natural regret at being obliged to divide her sway, it was a matter, he
thought, wherein he had best not interfere. One day, however, she
surprised him by the request:—
"Joseph, can you take or send me to Magnolia to-morrow?"
"Certainly, Aunt!" he replied. "I suppose you want to visit Cousin
Phebe; you have not seen her since last summer."
"It was that,—and something more." She paused a moment, and
then added, more firmly: "She has always wished that I should make
my home with her, but I couldn't think of any change so long as I
was needed here. It seems to me that I am not really needed now."
"Why, Aunt Rachel!" Joseph exclaimed, "I meant this to be your
home always, as much as mine! Of course you are needed,—not to
do all that you have done heretofore, but as a part of the family. It is
your right."
"I understand all that, Joseph. But I've heard it said that a young
wife should learn to see to everything herself, and Julia, I'm sure,
doesn't need either my help or my advice."
Joseph's face became very grave. "Has she—has she—?" he
stammered.
"No," said Rachel, "she has not said it—in words. Different persons
have different ways. She is quick, O very quick!—and capable. You
know I could never sit idly by, and look on; and it's hard to be
directed. I seem to belong to the place and everything connected
with it; yet there's times when what a body ought to do is plain."
In endeavoring to steer a middle course between her conscience
and her tender regard for her nephew's feelings Rachel only
confused and troubled him. Her words conveyed something of the
truth which she sought to hide under them. She was both angered
and humiliated; the resistance with which she had attempted to
meet Julia's domestic innovations was no match for the latter's
tactics; it had gone down like a barrier of reeds and been
contemptuously trampled under foot. She saw herself limited,
opposed, and finally set aside by a cheerful dexterity of
management which evaded her grasp whenever she tried to resent
it. Definite acts, whereon to base her indignation, seemed to slip
from her memory, but the atmosphere of the house became fatal to
her. She felt this while she spoke, and felt also that Joseph must be
spared.
"Aunt Rachel," said he, "I know that Julia is very anxious to learn
everything which she thinks belongs to her place,—perhaps a little
more than is really necessary. She's an enthusiastic nature, you
know. Maybe you are not fully acquainted yet; maybe you have
misunderstood her in some things: I would like to think so."
"It is true that we are different, Joseph,—very different. I don't
say, therefore, that I'm always right. It's likely, indeed, that any
young wife and any old housekeeper like myself would have their
various notions. But where there can be only one head, it's the
wife's place to be that head. Julia has not asked it of me, but she
has the right. I can't say, also, that I don't need a little rest and
change, and there seems to be some call on me to oblige Phebe.
Look at the matter in the true light," she continued, seeing that
Joseph remained silent, "and you must feel that it's only natural."
"I hope so," he said at last, repressing a sigh; "all things are
changing."
"What can we do?" Julia asked, that evening, when he had
communicated to her his aunt's resolution; "it would be so delightful
if she would stay, and yet I have had a presentiment that she would
leave us—for a little while only, I hope. Dear, good Aunt Rachel! I
couldn't help seeing how hard it was for her to allow the least
change in the order of housekeeping. She would be perfectly happy
if I would sit still all day and let her tire herself to death; but how
can I do that, Joseph? And no two women have exactly the same
ways and habits. I've tried to make everything pleasant for her: if
she would only leave many little matters entirely to me, or at least
not think of them,—but I fear she cannot. She manages to see the
least that I do, and secretly worries about it, in the very kindness of
her heart. Why can't women carry on partnerships in housekeeping
as men do in business? I suppose we are too particular; perhaps I
am just as much so as Aunt Rachel. I have no doubt she thinks a
little hardly of me, and so it would do her good—we should really
come nearer again—if she had a change. If she will go, Joseph, she
must at least leave us with the feeling that our home is always hers,
whenever she chooses to accept it."
Julia bent over Joseph's chair, gave him a rapid kiss, and then
went off to make her peace with Aunt Rachel. When the two women
came to the tea-table the latter had an uncertain, bewildered air,
while the eyelids of the former were red,—either from tears or much
rubbing.
A fortnight afterwards Rachel Miller left the farm and went to
reside with her widowed niece, in Magnolia.
The day after her departure another surprise came to Joseph in
the person of his father-in-law. Mr. Blessing arrived in a hired vehicle
from the station. His face was so red and radiant from the March
winds, and perhaps some private source of satisfaction, that his
sudden arrival could not possibly be interpreted as an omen of ill-
fortune. He shook hands with the Irish groom who had driven him
over, gave him a handsome gratuity in addition to the hire of the
team, extracted an elegant travelling-satchel from under the seat,
and met Joseph at the gate, with a breezy burst of feeling:—
"God bless you, son-in-law! It does my heart good to see you
again! And then, at last, the pleasure of beholding your ancestral
seat; really, this is quite—quite manorial!"
Julia, with a loud cry of "O pa!" came rushing from the house.
"Bless me, how wild and fresh the child looks!" cried Mr. Blessing,
after the embrace. "Only see the country roses on her cheeks!
Almost too young and sparkling for Lady Asten, of Asten Hall, eh? As
Dryden says, 'Happy, happy, happy pair!' It takes me back to the
days when I was a gay young lark; but I must have a care, and not
make an old fool of myself. Let us go in and subside into soberness:
I am ready both to laugh and cry."
When they were seated in the comfortable front room, Mr.
Blessing opened his satchel and produced a large leather-covered
flask. Julia was probably accustomed to his habits, for she at once
brought a glass from the sideboard.
"I am still plagued with my old cramps," her father said to Joseph,
as he poured out a stout dose. "Physiologists, you know, have
discovered that stimulants diminish the wear and tear of life, and I
find their theories correct. You, in your pastoral isolation and
pecuniary security, can form no conception of the tension under
which we men of office and of the world live, Beatus ille, and so
forth,—strange that the only fragment of Latin which I remember
should be so appropriate! A little water, if you please, Julia."
In the evening, when Mr. Blessing, slippered, sat before the open
fireplace, with a cigar in his mouth, the object of his sudden visit
crept by slow degrees to the light. "Have you been dipping into oil?"
he asked Joseph.
Julia made haste to reply. "Not yet, but almost everybody in the
neighborhood is ready to do so now, since Clemson has realized his
fifty thousand dollars in a single year. They are talking of nothing
else in the village. I heard yesterday, Joseph, that Old Bishop has
taken three thousand dollars' worth of stock in a new company."
"Take my advice, and don't touch 'em!" exclaimed Mr. Blessing.
"I had not intended to," said Joseph.
"There is this thing about these excitements," Mr. Blessing
continued: "they never reach the rural districts until the first sure
harvest is over. The sharp, intelligent operators in the large cities—
the men who are ready to take up soap, thimbles, hand-organs,
electricity, or hymn-books, at a moment's notice—always cut into a
new thing before its value is guessed by the multitude. Then the
smaller fry follow and secure their second crop, while your quiet men
in the country are shaking their heads and crying 'humbug!' Finally,
when it really gets to be a humbug, in a speculative sense, they just
begin to believe in it, and are fair game for the bummers and camp-
followers of the financial army. I respect Clemson, though I never
heard of him before; as for Old Bishop, he may be a very worthy
man, but he'll never see the color of his three thousand dollars
again."
"Pa!" cried Julia, "how clear you do make everything. And to think
that I was wishing—O, wishing so much!—that Joseph would go into
oil."
She hung her head a little, looking at Joseph with an affectionate,
penitent glance. A quick gleam of satisfaction passed over Mr.
Blessing's face; he smiled to himself, puffed rapidly at his cigar for a
minute, and then resumed: "In such a field of speculation everything
depends on being initiated. There are men in the city—friends of
mine—who know every foot of ground in the Alleghany Valley. They
can smell oil, if it's a thousand feet deep. They never touch a thing
that isn't safe,—but, then, they know what's safe. In spite of the
swindling that's going on, it takes years to exhaust the good points;
just so sure as your honest neighbors here will lose, just so sure will
these friends of mine gain. There are millions in what they have
under way, at this moment."
"What is it?" Julia breathlessly asked, while Joseph's face betrayed
that his interest was somewhat aroused.
Mr. Blessing unlocked his satchel, and took from it a roll of paper,
which he began to unfold upon his knee. "Here," he said, "you see
this bend of the river, just about the centre of the oil region, which is
represented by the yellow color. These little dots above the bend are
the celebrated Fluke Wells; the other dots below are the equally
celebrated Chowder Wells. The distance between the two is nearly
three miles. Here is an untouched portion of the treasure,—a pocket
of Pactolus waiting to be rifled. A few of us have acquired the land,
and shall commence boring immediately."
"But," said Joseph, "it seems to me that either the attempt must
have been made already, or that the land must command such an
enormous price as to lessen the profits."
"Wisely spoken! It is the first question which would occur to any
prudent mind. But what if I say that neither is the case? And you,
who are familiar with the frequent eccentricities of old farmers, can
understand the explanation. The owner of the land was one of your
ignorant, stubborn men, who took such a dislike to the prospectors
and speculators, that he refused to let them come near him. Both
the Fluke and Chowder Companies tried their best to buy him out,
but he had a malicious pleasure in leading them on to make
immense offers, and then refusing. Well, a few months ago he died,
and his heirs were willing enough to let the land go; but before it
could be regularly offered for sale, the Fluke and Chowder Wells
began to flow less and less. Their shares fell from 270 to 95; the
supposed value of the land fell with them, and finally the moment
arrived when we could purchase for a very moderate sum. I see the
question in your mind; why should we wish to buy when the other
wells were giving out? There comes in the secret, which is our
veritable success. Consider it whispered in your ears, and locked in
your bosoms,—torpedoes! It was not then generally exploded (to
carry out the image), so we bought at the low figure, in the very
nick of time. Within a week the Fluke and Chowder Wells were
torpedoed, and came back to more than their former capacity; the
shares rose as rapidly as they had fallen, and the central body we
hold—to which they are, as it were, the two arms—could now be
sold for ten times what it cost us!"
Here Mr. Blessing paused, with his finger on the map, and a light
of merited triumph in his eyes. Julia clapped her hands, sprang to
her feet, and cried: "Trumps at last!"
"Ay," said he, "wealth, repose for my old days,—wealth for us all,
if your husband will but take the hand I hold out to him. You now
know, son-in-law, why the endorsement you gave me was of such
vital importance; the note, as you are aware, will mature in another
week. Why should you not charge yourself with the payment, in
consideration of the transfer to you of shares of the original stock,
already so immensely appreciated in value? I have delayed making
any provision, for the sake of offering you the chance."
Julia was about to speak, but restrained herself with an apparent
effort.
"I should like to know," Joseph said, "who are associated with you
in the undertaking?"
"Well done, again! Where did you get your practical shrewdness?
The best men in the city!—not only the Collector and the Surveyor,
but Congressman Whaley, E. D. Stokes, of Stokes, Pirricutt and
Company, and even the Reverend Doctor Lellifant. If I had not been
an old friend of Kanuck, the agent who negotiated the purchase, my
chance would have been impalpably small. I have all the documents
with me. There has been no more splendid opportunity since oil
became a power! I hesitate to advise even one so near to me in
such matters; but if you knew the certainties as I know them, you
would go in with all your available capital. The excitement, as you
say, has reached the country communities, which are slow to rise
and equally slow to subside; all oil stock will be in demand, but the
Amaranth,—'The Blessing,' they wished to call it, but I was obliged
to decline, for official reasons,—the Amaranth shares will be the
golden apex of the market!"
Julia looked at Joseph with eager, hungry eyes. He, too, was
warmed and tempted by the prospect of easy profit which the
scheme held out to him; only the habit of his nature resisted, but
with still diminishing force. "I might venture the thousand," he said.
"It is no venture!" Julia cried. "In all the speculations I have heard
discussed by pa and his friends, there was nothing so admirably
managed as this. Such a certainty of profit may never come again. If
you will be advised by me, Joseph, you will take shares to the
amount of five or ten thousand."
"Ten thousand is exactly the amount I hold open," Mr. Blessing
gravely remarked. "That, however, does not represent the necessary
payment, which can hardly amount to more than twenty-five per
cent. before we begin to realize. Only ten per cent. has yet been
called, so that your thousand at present will secure you an
investment of ten thousand. Really, it seems like a fortunate
coincidence."
He went on, heating himself with his own words, until the
possibilities of the case grew so splendid that Joseph felt himself
dazzled and bewildered. Mr. Blessing was a master in the art of
seductive statement. Even where he was only the mouthpiece of
another, a few repetitions led him to the profoundest belief. Here
there could be no doubt of his sincerity, and, moreover, every
movement from the very inception of the scheme, every statistical
item, all collateral influences, were clear in his mind and instantly
accessible. Although he began by saying, "I will make no estimate of
the profits, because it is not prudent to fix our hopes on a positive
sum," he was soon carried far away from this resolution, and most
luxuriously engaged, pencil in hand, in figuring out results which
drove Julia wild with desire, and almost took away Joseph's breath.
The latter finally said, as they rose from the session, late at night:—
"It is settled that I take as much as the thousand will cover; but I
would rather think over the matter quietly for a day or two before
venturing further."
"You must," replied Mr. Blessing, patting him on the shoulder.
"These things are so new to your experience, that they disturb and—
I might almost say—alarm you. It is like bringing an increase of
oxygen into your mental atmosphere. (Ha! a good figure: for the
result will be, a richer, fuller life. I must remember it.) But you are a
healthy organization, and therefore you are certain to see clearly: I
can wait with confidence."
The next morning Joseph, without declaring his purpose, drove to
Coventry Forge to consult Philip. Mr. Blessing and Julia, remaining at
home, went over the shining ground again, and yet again,
confirming each other in the determination to secure it. Even
Joseph, as he passed up the valley in the mild March weather, taking
note of the crimson and gold of the flowering spice-bushes and
maple-trees, could not prevent his thoughts from dwelling on the
delights of wealth,—society, books, travel, and all the mellow,
fortunate expansion of life. Involuntarily, he hoped that Philip's
counsel might coincide with his father-in-law's offer.
But Philip was not at home. The forge was in full activity, the
cottage on the knoll was repainted and made attractive in various
ways, and Philip would soon return with his sister to establish a
permanent home. Joseph found the sign-spiritual of his friend in
numberless little touches and changes; it seemed to him that a new
soul had entered into the scenery of the place.
A mile or two farther up the valley, a company of mechanics and
laborers were apparently tearing the old Calvert mansion inside out.
House, barn, garden, and lawn were undergoing a complete
transformation. While he paused at the entrance of the private lane,
to take a survey of the operations, Mr. Clemson rode down to him
from the house. The Hopetons, he said, would migrate from the city
early in May: work had already commenced on the new railway, and
in another year a different life would come upon the whole
neighborhood.
In the course of the conversation Joseph ventured to sound Mr.
Clemson in regard to the newly formed oil companies. The latter
frankly confessed that he had withdrawn from further speculation,
satisfied with his fortune; he preferred to give no opinion, further
than that money was still to be made, if prudently placed. The Fluke
and Chowder Wells, he said, were old, well known, and profitable.
The new application of torpedoes had restored their failing flow, and
the stock had recovered from its temporary depreciation. His own
venture had been made in another part of the region.
The atmosphere into which Joseph entered, on returning home,
took away all further power of resistance. Tempted already, and
impressed by what he had learned, he did what his wife and father-
in-law desired.
CHAPTER XV.
A DINNER PARTY.
Having assumed the payment of Mr. Blessing's note, as the first
instalment upon his stock, Joseph was compelled to prepare himself
for future emergencies. A year must still elapse before the term of
the mortgage upon his farm would expire, but the sums he had
invested for the purpose of meeting it when due must be held ready
for use. The assurance of great and certain profit in the mean time
rendered this step easy; and, even at the worst, he reflected, there
would be no difficulty in procuring a new mortgage whereby to
liquidate the old. A notice which he received at this time, that a
second assessment of ten per cent. on the Amaranth stock had been
made, was both unexpected and disquieting. Mr. Blessing, however,
accompanied it with a letter, making clear not only the necessity, but
the admirable wisdom of a greater present outlay than had been
anticipated. So the first of April—the usual business anniversary of
the neighborhood—went smoothly by. Money was plenty, the Asten
credit had always been sound, and Joseph tasted for the first time a
pleasant sense of power in so easily receiving and transferring
considerable sums.
One result of the venture was the development of a new phase in
Julia's nature. She not only accepted the future profit as certain, but
she had apparently calculated its exact amount and framed her
plans accordingly. If she had been humiliated by the character of
Joseph's first business transaction with her father, she now made
amends for it. "Pa" was their good genius. "Pa" was the agency
whereby they should achieve wealth and social importance. Joseph
now had the clearest evidence of the difference between a man who
knew the world and was of value in it, and their slow, dull-headed
country neighbors. Indeed, Julia seemed to consider the Asten
property as rather contemptible beside the splendor of the Blessing
scheme. Her gratitude for a quiet home, her love of country life, her
disparagement of the shams and exactions of "society," were given
up as suddenly and coolly as if she had never affected them. She
gave herself no pains to make the transition gradual, and thus lessen
its shock. Perhaps she supposed that Joseph's fresh, unsuspicious
nature was so plastic that it had already sufficiently taken her
impress, and that he would easily forget the mask she had worn. If
so, she was seriously mistaken.
He saw, with a deadly chill of the heart, the change in her manner,
—a change so complete that another face confronted him at the
table, even as another heart beat beside his on the dishallowed
marriage-bed. He saw the gentle droop vanish from the eyelids,
leaving the cold, flinty pupils unshaded; the soft appeal of the half-
opened lips was lost in the rigid, almost cruel compression which
now seemed habitual to them; all the slight dependent gestures, the
tender airs of reference to his will or pleasure, had rapidly
transformed themselves into expressions of command or obstinate
resistance. But the patience of a loving man is equal to that of a
loving woman: he was silent, although his silence covered an ever-
increasing sense of outrage.
Once it happened, that after Julia had been unusually eloquent
concerning "what pa is doing for us," and what use they should
make of "pa's money, as I call it," Joseph quietly remarked:—
"You seem to forget, Julia, that without my money not much could
have been done."
An angry color came into her face; but, on second thought, she
bent her head, and murmured in an offended voice: "It is very mean
and ungenerous in you to refer to our temporary poverty. You might
forget, by this time, the help pa was compelled to ask of you."
"I did not think of that!" he exclaimed. "Besides, you did not seem
entirely satisfied with my help, at the time."
"O, how you misunderstand me!" she groaned. "I only wished to
know the extent of his need. He is so generous, so considerate
towards us, that we only guess his misfortune at the last moment."
The possibility of being unjust silenced Joseph. There were tears
in Julia's voice, and he imagined they would soon rise to her eyes.
After a long, uncomfortable pause, he said, for the sake of changing
the subject: "What can have become of Elwood Withers? I have not
seen him for months."
"I don't think you need care to know," she remarked. "He's a
rough, vulgar fellow: it's just as well if he keeps away from us."
"Julia! he is my friend, and must always be welcome to me. You
were friendly enough towards him, and towards all the
neighborhood, last summer: how is it that you have not a good word
to say now?"
He spoke warmly and indignantly. Julia, however, looked at him
with a calm, smiling face. "It is very simple," she said. "You will
agree with me, in another year. A guest, as I was, must try to see
only the pleasant side of people: that's our duty; and so I enjoyed—
as much as I could—the rusticity, the awkwardness, the ignorance,
the (now, don't be vexed, dear!)—the vulgarity of your friend. As
one of the society of the neighborhood, as a resident, I am not
bound by any such delicacy. I take the same right to judge and
select as I should take anywhere. Unless I am to be hypocritical, I
cannot—towards you, at least—conceal my real feelings. How shall I
ever get you to see the difference between yourself and these
people, unless I continually point it out? You are modest, and don't
like to acknowledge your own superiority."
She rose from the table, laughing, and went out of the room
humming a lively air, leaving Joseph to make the best of her words.
A few days after this the work on the branch railway, extending
down the valley, reached a point where it could be seen from the
Asten farm. Joseph, on riding over to inspect the operations, was
surprised to find Elwood, who had left his father's place and become
a sub-contractor. The latter showed his hearty delight at their
meeting.
"I've been meaning to come up," he said, "but this is a busy time
for me. It's a chance I couldn't let slip, and now that I've taken hold
I must hold on. I begin to think this is the thing I was made for,
Joseph."
"I never thought of it before," Joseph answered, "and yet I'm sure
you are right. How did you hit upon it?"
"I didn't; it was Mr. Held."
"Philip?"
"Him. You know I've been hauling for the Forge, and so it turned
up by degrees, as I may say. He's at home, and, I expect, looking
for you. But how are you now, really?"
Elwood's question meant a great deal more than he knew how to
say. Suddenly, in a flash of memory, their talk of the previous year
returned to Joseph's mind; he saw his friend's true instincts and his
own blindness as never before. But he must dissemble, if possible,
with that strong, rough, kindly face before him.
"O," he said, attempting a cheerful air, "I am one of the old folks
now. You must come up—"
The recollection of Julia's words cut short the invitation upon his
lips. A sharp pang went through his heart, and the treacherous
blood crowded to his face all the more that he tried to hold it back.
"Come, and I'll show you where we're going to make the cutting,"
Elwood quietly said, taking him by the arm. Joseph fancied,
thenceforth, that there was a special kindness in his manner, and the
suspicion seemed to rankle in his mind as if he had been slighted by
his friend.
As before, to vary the tedium of his empty life, so now, to escape
from the knowledge which he found himself more and more
powerless to resist, he busied himself beyond all need with the work
of the farm. Philip had returned with his sister, he knew, but after
the meeting with Elwood he shrank with a painful dread from Philip's
heart-deep, intimate eye. Julia, however, all the more made use of
the soft spring weather to survey the social ground, and choose
where to take her stand. Joseph scarcely knew, indeed, how
extensive her operations had been, until she announced an invitation
to dine with the Hopetons, who were now in possession of the
renovated Calvert place. She enlarged, more than was necessary, on
the distinguished city position of the family, and the importance of
"cultivating" its country members. Joseph's single brief meeting with
Mr. Hopeton—who was a short, solid man, in ripe middle age, of a
thoroughly cosmopolitan, though not a remarkably intellectual stamp
—had been agreeable, and he recognized the obligation to be
neighborly. Therefore he readily accepted the invitation on his own
grounds.
When the day arrived, Julia, after spending the morning over her
toilet, came forth resplendent in rosy silk, bright and dazzling in
complexion, and with all her former grace of languid eyelids and
parted lips. The void in Joseph's heart grew wider at the sight of
her; for he perceived, as never before, her consummate skill in
assuming a false character. It seemed incredible that he should have
been so deluded. For the first time a feeling of repulsion, which was
almost disgust, came upon him as he listened to her prattle of
delight in the soft weather, and the fragrant woods, and the
blossoming orchards. Was not, also, this delight assumed? he asked
himself: false in one thing, false in all, was the fatal logic which then
and there began its torment.
The most that was possible in such a short time had been
achieved on the Calvert place. The house had been brightened,
surrounded by light, airy verandas, and the lawn and garden, thrown
into one and given into the hands of a skilful gardener, were scarcely
to be recognized. A broad, solid gravel-walk replaced the old tan-
covered path; a pretty fountain tinkled before the door; thick beds of
geranium in flower studded the turf, and veritable thickets of rose-
trees were waiting for June. Within the house, some rooms had
been thrown together, the walls richly yet harmoniously colored, and
the sumptuous furniture thus received a proper setting. In contrast
to the houses of even the wealthiest farmers, which expressed a
nicely reckoned sufficiency of comfort, the place had an air of joyous
profusion, of a wealth which delighted in itself.
Mr. Hopeton met them with the frank, offhand manner of a man of
business. His wife followed, and the two guests made a rapid
inspection of her as she came down the hall. Julia noticed that her
crocus-colored dress was high in the neck, and plainly trimmed; that
she wore no ornaments, and that the natural pallor of her
complexion had not been corrected by art. Joseph remarked the
simple grace of her movement, the large, dark, inscrutable eyes, the
smooth bands of her black hair, and the pure though somewhat
lengthened oval of her face. The gentle dignity of her manner more
than refreshed, it soothed him. She was so much younger than her
husband that Joseph involuntarily wondered how they should have
come together.
The greetings were scarcely over before Philip and Madeline Held
arrived. Julia, with the least little gush of tenderness, kissed the
latter, whom Philip then presented to Joseph for the first time. She
had the same wavy hair as her brother, but the golden hue was
deepened nearly into brown, and her eyes were a clear hazel. It was
also the same frank, firm face, but her woman's smile was so much
the sweeter as her lips were lovelier than the man's. Joseph seemed
to clasp an instant friendship in her offered hand.
There was but one other guest, who, somewhat to his surprise,
was Lucy Henderson. Julia concealed whatever she might have felt,
and made so much reference to their former meetings as might
satisfy Lucy without conveying to Mrs. Hopeton the impression of
any special intimacy. Lucy looked thin and worn, and her black silk
dress was not of the latest fashion: she seemed to be the poor
relation of the company. Joseph learned that she had taken one of
the schools in the valley, for the summer. Her manner to him was as
simple and friendly as ever, but he felt the presence of some new
element of strength and self-reliance in her nature.
His place at dinner was beside Mrs. Hopeton, while Lucy—
apparently by accident—sat upon the other side of the hostess.
Philip and the host led the conversation, confining it too exclusively
to the railroad and iron interests; but those finally languished, and
gave way to other topics in which all could take part. Joseph felt that
while the others, except Lucy and himself, were fashioned under
different aspects of life, some of which they shared in common, yet
that their seeming ease and freedom of communication touched,
here and there, some invisible limit, which they were careful not to
pass. Even Philip appeared to be beyond his reach, for the time.
The country and the people, being comparatively new to them,
naturally came to be discussed.
"Mr. Held, or Mr. Asten,—either of you know both,"—Mr. Hopeton
asked, "what are the principal points of difference between society in
the city and in the country?"
"Indeed, I know too little of the city," said Joseph.
"And I know too little of the country,—here, at least," Philip added.
"Of course the same passions and prejudices come into play
everywhere. There are circles, there are jealousies, ups and downs,
scandals, suppressions, and rehabilitations: it can't be otherwise."
"Are they not a little worse in the country," said Julia, "because—I
may ask the question here, among us—there is less refinement of
manner?"
"If the external forms are ruder," Philip resumed, "it may be an
advantage, in one sense. Hypocrisy cannot be developed into an
art."
Julia bit her lip, and was silent.
"But are the country people, hereabouts, so rough?" Mrs. Hopeton
asked. "I confess that they don't seem so to me. What do you say,
Miss Henderson?"
"Perhaps I am not an impartial witness," Lucy answered. "We care
less about what is called 'manners' than the city people. We have no
fixed rules for dress and behavior,—only we don't like any one to
differ too much from the rest of us."
"That's it!" Mr. Hopeton cried; "the tyrannical levelling sentiment
of an imperfectly developed community! Fortunately, I am beyond its
reach."
Julia's eyes sparkled: she looked across the table at Joseph, with a
triumphant air.
Philip suddenly raised his head. "How would you correct it? Simply
by resistance?" he asked.
Mr. Hopeton laughed. "I should no doubt get myself into a
hornet's-nest. No; by indifference!"
Then Madeline Held spoke. "Excuse me," she said; "but is
indifference possible, even if it were right? You seem to take the
levelling spirit for granted, without looking into its character and
causes; there must be some natural sense of justice, no matter how
imperfectly society is developed. We are members of this
community,—at least, Philip and I certainly consider ourselves so,—
and I am determined not to judge it without knowledge, or to offend
what may be only mechanical habits of thought, unless I can see a
sure advantage in doing so."
Lucy Henderson looked at the speaker with a bright, grateful face.
Joseph's eyes wandered from her to Julia, who was silent and
watchful.
"But I have no time for such conscientious studies," Mr. Hopeton
resumed. "One can be satisfied with half a dozen neighbors, and let
the mass go. Indifference, after all, is the best philosophy. What do
you say, Mr. Held?"
"Indifference!" Philip echoed. A dark flush came into his face, and
he was silent a moment. "Yes: our hearts are inconvenient
appendages. We suffer a deal from unnecessary sympathies, and
from imagining, I suppose, that others feel them as we do. These
uneasy features of society are simply the effort of nature to find
some occupation for brains otherwise idle—or empty. Teach the
people to think, and they will disappear."
Joseph stared at Philip, feeling that a secret bitterness was hidden
under his careless, mocking air. Mrs. Hopeton rose, and the company
left the table. Madeline Held had a troubled expression, but there
was an eager, singular brightness in Julia's eyes.
"Emily, let us have coffee on the veranda," said Mr. Hopeton,
leading the way. He had already half forgotten the subject of
conversation: his own expressions, in fact, had been made very
much at random, for the sole purpose of keeping up the flow of talk.
He had no very fixed views of any kind, beyond the sphere of his
business activity.
Philip, noticing the impression he had made on Joseph, drew him
to one side. "Don't seriously remember my words against me," he
said; "you were sorry to hear them, I know. All I meant was, that an
over-sensitive tenderness towards everybody is a fault. Besides, I
was provoked to answer him in his own vein."
"But, Philip!" Joseph whispered, "such words tempt me! What if
they were true?"
Philip grasped his arm with a painful force. "They never can be
true to you, Joseph," he said.
Gay and pleasant as the company seemed to be, each one felt a
secret sense of relief when it came to an end. As Joseph drove
homewards, silently recalling what had been said, Julia interrupted
his reflections with: "Well, what do you think of the Hopetons?"
"She is an interesting woman," he answered.
"But reserved; and she shows very little taste in dress. However, I
suppose you hardly noticed anything of the kind. She kept Lucy
Henderson beside her as a foil: Madeline Held would have been
damaging."
Joseph only partly guessed her meaning; it was repugnant, and he
determined to avoid its further discussion.
"Hopeton is a shrewd business man," Julia continued, "but he
cannot compare with her for shrewdness—either with her or—Philip
Held!"
"What do you mean?"
"I made a discovery before the dinner was over, which you—
innocent, unsuspecting man that you are—might have before your
eyes for years, without seeing it. Tell me now, honestly, did you
notice nothing?"
"What should I notice, beyond what was said?" he asked.
"That was the least!" she cried; "but, of course, I knew you
couldn't. And perhaps you won't believe me, when I tell you that
Philip Held,—your particular friend, your hero, for aught I know your
pattern of virtue and character, and all that is manly and noble,—
that Philip Held, I say, is furiously in love with Mrs. Hopeton!"
Joseph started as if he had been shot, and turned around with an
angry red on his brow. "Julia!" he said, "how dare you speak so of
Philip!"
She laughed. "Because I dare to speak the truth, when I see it. I
thought I should surprise you. I remembered a certain rumor I had
heard before she was married,—while she was Emily Marrable,—and
I watched them closer than they guessed. I'm certain of Philip: as
for her, she's a deep creature, and she was on her guard; but they
are near neighbors."
Joseph was thoroughly aroused and indignant. "It is your own
fancy!" he exclaimed. "You hate Philip on account of that affair with
Clementina; but you ought to have some respect for the woman
whose hospitality you have accepted!"
"Bless me! I have any quantity of respect both for her and her
furniture. By the by, Joseph, our parlor would furnish better than
hers; I have been thinking of a few changes we might make, which
would wonderfully improve the house. As for Philip, Clementina was
a fool. She'd be glad enough to have him now, but in these matters,
once gone is gone for good. Somehow, people who marry for love
very often get rich afterwards,—ourselves, for instance."
It was some time before Joseph's excitement subsided. He had
resented Julia's suspicion as dishonorable to Philip, yet he could not
banish the conjecture of its possible truth. If Philip's affected
cynicism had tempted him, Julia's unblushing assumption of the
existence of a passion which was forbidden, and therefore positively
guilty, seemed to stain the pure texture of his nature. The lightness
with which she spoke of the matter was even more abhorrent to him
than the assertion itself; the malicious satisfaction in the tones of
her voice had not escaped his ear.
"Julia," he said, just before they reached home, "do not mention
your fancy to another soul than me. It would reflect discredit on
you."
"You are innocent," she answered. "And you are not
complimentary. If I have any remarkable quality, it is tact. Whenever
I speak, I shall know the effect before-hand; even pa, with all his
official experience, is no match for me in this line. I see what the
Hopetons are after, and I mean to show them that we were first in
the field. Don't be concerned, you good, excitable creature, you are
no match for such well-drilled people. Let me alone, and before the
summer is over we will give the law to the neighborhood!"
Welcome to our website – the perfect destination for book lovers and
knowledge seekers. We believe that every book holds a new world,
offering opportunities for learning, discovery, and personal growth.
That’s why we are dedicated to bringing you a diverse collection of
books, ranging from classic literature and specialized publications to
self-development guides and children's books.
More than just a book-buying platform, we strive to be a bridge
connecting you with timeless cultural and intellectual values. With an
elegant, user-friendly interface and a smart search system, you can
quickly find the books that best suit your interests. Additionally,
our special promotions and home delivery services help you save time
and fully enjoy the joy of reading.
Join us on a journey of knowledge exploration, passion nurturing, and
personal growth every day!
ebookbell.com

More Related Content

PDF
Mathematical Models and Methods for Real World Systems 1st Edition K.M. Furati
juttizabana
 
PDF
Markov Chain Monte Carlo Innovations And Applications W S Kendall
uqrrjsudrd750
 
PDF
Advances in Imaging and Electron Physics 127 1st Edition Peter W. Hawkes (Eds.)
lakoulyaho
 
PDF
Mathematical Models and Methods for Real World Systems 1st Edition K.M. Furati
grpdkcz3344
 
PDF
Research Proposal
Komlan Atitey
 
PDF
Advances in Imaging and Electron Physics Vol 117 1st Edition Peter W. Hawkes ...
jusumuhame
 
DOC
danreport.doc
butest
 
PDF
Mathematical Models and Methods for Real World Systems 1st Edition K.M. Furati
hildtzizak7p
 
Mathematical Models and Methods for Real World Systems 1st Edition K.M. Furati
juttizabana
 
Markov Chain Monte Carlo Innovations And Applications W S Kendall
uqrrjsudrd750
 
Advanced Mean Field Methods Theory And Practice Manfred Opper David Saad

Neural Information Processing Series
Michael I. Jordan and Sara I. Solla, series editors

Advances in Large Margin Classifiers, Alexander J. Smola, Peter L. Bartlett, Bernhard Schölkopf, and Dale Schuurmans, eds., 2000

Advanced Mean Field Methods: Theory and Practice, Manfred Opper and David Saad, eds., 2001
Advanced Mean Field Methods: Theory and Practice
Edited by Manfred Opper and David Saad
The MIT Press, Cambridge, Massachusetts; London, England
© 2001 Massachusetts Institute of Technology. All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

Library of Congress Cataloging-in-Publication Data
Advanced mean field methods: theory and practice / edited by Manfred Opper and David Saad
p. cm. (Neural Information Processing Series)
Includes bibliographical references.
ISBN 0-262-15054-9 (alk. paper)
1. Mean field theory. I. Opper, Manfred. II. Saad, David.
QC174.85.M43 A38 2001
530.15'95-dc21 00-053322
CONTENTS

Series Foreword
Foreword
Contributors
Acknowledgments

1 Introduction
  Manfred Opper and David Saad
2 From Naive Mean Field Theory to the TAP Equations
  Manfred Opper and Ole Winther
3 An Idiosyncratic Journey Beyond Mean Field Theory
  Jonathan S. Yedidia
4 Mean Field Theory for Graphical Models
  Hilbert J. Kappen and Wim J. Wiegerinck
5 The TAP Approach to Intensive and Extensive Connectivity Systems
  Yoshiyuki Kabashima and David Saad
6 TAP For Parity Check Error Correcting Codes
  David Saad, Yoshiyuki Kabashima and Renato Vicente
7 Adaptive TAP Equations
  Manfred Opper and Ole Winther
8 Mean-field Theory of Learning: From Dynamics to Statics
  K. Y. Michael Wong, S. Li and Peixun Luo
9 Saddle-point Methods for Intractable Graphical Models
  Fernando J. Pineda, Cheryl Resch and I-Jeng Wang
10 Tutorial on Variational Approximation Methods
  Tommi S. Jaakkola
11 Graphical Models and Variational Methods
  Zoubin Ghahramani and Matthew J. Beal
12 Some Examples of Recursive Variational Approximations for Bayesian Inference
  K. Humphreys and D.M. Titterington
13 Tractable Approximate Belief Propagation
  David Barber
14 The Attenuated Max-Product Algorithm
  Brendan J. Frey and Ralf Koetter
15 Comparing the Mean Field Method and Belief Propagation for Approximate Inference in MRFs
  Yair Weiss
16 Information Geometry of α-Projection in Mean Field Approximation
  Shun-ichi Amari, Shiro Ikeda and Hidetoshi Shimokawa
17 Information Geometry of Mean-Field Approximation
  Toshiyuki Tanaka
SERIES FOREWORD

The yearly Neural Information Processing Systems (NIPS) workshops bring together scientists with broadly varying backgrounds in statistics, mathematics, computer science, physics, electrical engineering, neuroscience, and cognitive science, unified by a common desire to develop novel computational and statistical strategies for information processing, and to understand the mechanisms for information processing in the brain. As opposed to conferences, these workshops maintain a flexible format that both allows and encourages the presentation and discussion of work in progress, and thus serve as an incubator for the development of important new ideas in this rapidly evolving field.

The Series Editors, in consultation with workshop organizers and members of the NIPS Foundation Board, select specific workshop topics on the basis of scientific excellence, intellectual breadth, and technical impact. Collections of papers chosen and edited by the organizers of specific workshops are built around pedagogical introductory chapters, while research monographs provide comprehensive descriptions of workshop-related topics, to create a series of books that provides a timely, authoritative account of the latest developments in the exciting field of neural computation.

Michael I. Jordan, Sara I. Solla
FOREWORD

The links between statistical physics and the information sciences (including computer science, statistics, and communication theory) have grown stronger in recent years, as the needs of applications have increasingly led researchers in the information sciences towards the study of large-scale, highly-coupled probabilistic systems that are reminiscent of models in statistical physics. One useful link is the class of Markov Chain Monte Carlo (MCMC) methods, sampling-based algorithms whose roots lie in the simulation of gases and condensed matter, but whose appealing generality and simplicity of implementation have sparked new applications throughout the information sciences. Another source of links, currently undergoing rapid development, is the class of mean-field methods that are the topic of this book. Mean-field methods aim to solve many of the same problems as are addressed by MCMC methods, but do so using different conceptual and mathematical tools. Mean-field methods are deterministic methods, making use of tools such as Taylor expansions and convex relaxations to approximate or bound quantities of interest. While the analysis of MCMC methods reposes on the theory of Markov chains and stochastic matrices, mean-field methods make links to optimization theory and perturbation theory.

Underlying much of the heightened interest in these links between statistical physics and the information sciences is the development (in the latter field) of a general framework for associating joint probability distributions with graphs, and for exploiting the structure of the graph in the computation of marginal probabilities and expectations. Probabilistic graphical models are graphs, directed or undirected, annotated with functions defined on local clusters of nodes that, when taken together, define families of joint probability distributions on the graph. Not only are the classical models of statistical physics instances of graphical models (generally involving undirected graphs), but many applied probabilistic models with no obvious connection to physics are graphical models as well; examples include phylogenetic trees in genetics, diagnostic systems in medicine, unsupervised learning models in machine learning, and error-control codes in information theory. The availability of the general framework has made it possible for ideas to flow more readily between these fields.

In physics one of the principal applications of mean-field methods is the prediction of "phase transitions", discontinuities in aggregate properties of a system under the scaling of one or more parameters associated with the system. A physicist reading the current book may thus be surprised by the relatively infrequent occurrence of the term "phase transition". In the applications to the information sciences, it is often the values of the "microscopic" variables that are of most interest, while the "macroscopic" properties of the system are often of secondary interest. Thus in the genetics application we are interested in the genotype of specific individuals; in the diagnostic applications our interest is in the probability of specific diseases; and in error-control coding we wish to recover the bits in the transmitted message. Moreover, in many of these applications we are interested in a specific graph, whose parameters are determined by statistical methods, by a domain expert or by a designer, and it is a matter of secondary interest how aggregate properties of the probability distribution would change in some hypothetical alternative graph in which certain parameters have been scaled.

This is not to say that aggregate properties of probability distributions are not of interest; indeed they are key to understanding the mean-field approach. The calculation of the probability distribution of any given "microscopic" variable (the marginal probability of a node in the graph) is an aggregation operation, requiring summing or integrating the joint probability with respect to all other variables. In statistical terms one is calculating a "log likelihood"; the physics terminology is the "free energy". In the computational framework referred to above one attempts to exploit the constraints imposed by the graphical structure to compute these quantities efficiently, essentially using the missing edges in the graph to manage the proliferation of intermediate terms that arise in computing multiple sums or integrals. This approach has been successful in many applied problems, principally involving graphs in the form of trees or chains. For more general graphs, however, a combinatorial explosion often rises up to slay any attempt to calculate marginal probabilities exactly. Unfortunately, it is precisely these graphs that are not in the form of trees and chains that are on the research frontier in many applied fields. New ideas are needed to cope with these graphs, and recent empirical results have suggested mean-field and related methods as candidates.

Mean-field methods take a more numerical approach to calculations in graphical models. There are several ways to understand mean-field methods, and the current book provides excellent coverage of all of the principal perspectives. One major theme is that of "relaxation", an idea familiar from modern optimization theory. Rather than computing a specific probability distribution, one relaxes the constraints defining the probability distribution, obtaining an optimization problem in which the solution to the original problem is the (unique) optimum. Relaxing constraints involves introducing Lagrange multipliers, and algorithms can be developed in which the original, "primal" problem is solved via "dual" relationships among the Lagrangian variables. This optimization perspective is important to understanding the computational consequences of adopting the physics framework. In particular, in the physics framework the free energy takes a mathematical form in which constraints are readily imposed and readily "relaxed". Note also that the physics framework permits expressing the free energy as the sum of two terms: the "average energy" and the "entropy". Computational methods can be developed that are geared to the specific mathematical forms taken by these terms.

The optimization perspective that mean-field theory brings to the table is useful in another way. In particular, the graphical models studied in the information sciences are often not fully determined by a prior scientific theory, but are viewed as statistical models that are to be fit to observed data. Fitting a model to data generally involves some form of optimization; in the simplest setting one maximizes the log likelihood with respect to the model parameters. As we have discussed, the mean-field approach naturally treats the log likelihood (free energy) as a parameterized function to be optimized, and it might be expected that this approach would therefore extend readily to likelihood-based statistical methods. Indeed, the simplest mean-field methods yield a lower bound on the log likelihood, and one can maximize this lower bound as a surrogate for the (generally intractable) maximization of the log likelihood.

While all of these arguments may have appeal to the physicist, particularly the physicist contemplating unemployment in the modern "information economy", for the information scientist there is room for doubt. A survey of the models studied by the physicists reveals properties that diverge from the needs of the information scientist. Statistical physical models are often homogeneous: the parameters linking the nodes are the same everywhere in the graph. More generally, the physical models choose parameters from distributions ("spin-glass models"), but these distributions are the same everywhere in the graph. The models allow "field terms" that are equivalent to "observed data" in the statistical setting, but often these field terms are assumed equal. Various graphical symmetries are often invoked. Some models assume infinite-ranged connections. All of these assumptions seem rather far from the highly inhomogeneous, irregular setting of models in fields such as genetics, medical diagnosis, unsupervised learning or error-control coding.

While it is possible that some of these assumptions are required for mean-field methods to succeed, there are reasons to believe that the scope of mean-field methods extends beyond the restrictive physical setting that engendered them. First, as reported by several of the papers in this volume, there have been a number of empirical successes involving mean-field methods, in problems far from the physics setting. Second, many of the assumptions have been imposed with the goal of obtaining analytical results, particularly as part of the hunt for phase transitions. Viewed as a computational methodology, mean-field theory may not require such strong symmetries or homogeneities. Third, there is reason to believe that the exact calculation techniques and mean-field techniques exploit complementary aspects of probabilistic graphical model structure, and that hybrid techniques may allow strong interactions to be removed using exact calculations, revealing more homogeneous "residuals" that can be handled via mean-field algorithms.

Considerations such as these form the principal subject matter of the book and are addressed in many of its chapters. While the book does an admirable job of covering the basics of mean-field theory in the classical setting of Ising and related models, the main thrust is the detailed consideration of the new links between computation and general probabilistic modeling that mean-field methods promise to expose. This is an exciting and timely topic, and the current book provides the best treatment yet available.

Michael I. Jordan
Berkeley
CONTRIBUTORS

Shun-Ichi Amari, RIKEN Brain Science Institute, Hirosawa 2-1, Wako-shi, Saitama 351-0198, Japan. [email protected]
David Barber, The Neural Computing Research Group, School of Engineering and Applied Science, Aston University, Birmingham B4 7ET, UK. [email protected]
Matthew J. Beal, Gatsby Computational Neuroscience Unit, University College London, 17 Queen Square, London WC1N 3AR, UK. [email protected]
Brendan J. Frey, Computer Science, University of Waterloo, Davis Centre, Waterloo, Ontario N2L 3G1, Canada. [email protected]
Zoubin Ghahramani, Gatsby Computational Neuroscience Unit, University College London, 17 Queen Square, London WC1N 3AR, UK. [email protected]
Keith Humphreys, Stockholm University/KTH, Department of Computer and Systems Sciences, Electrum 230, SE-164 40 Kista, Sweden. [email protected]
Shiro Ikeda, PRESTO, JST, Lab. for Mathematical Neuroscience, BSI, RIKEN, Hirosawa 2-1, Wako-shi, Saitama 351-0198, Japan. [email protected]
Tommi S. Jaakkola, Department of Computer Science and Electrical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. [email protected]
Yoshiyuki Kabashima, Department of Computational Intelligence and Systems Science, Tokyo Institute of Technology, Yokohama 2268502, Japan. [email protected]
Bert Kappen, Foundation for Neural Networks (SNN), Department of Medical Physics and Biophysics, University of Nijmegen, Geert Grooteplein 21, CPK1 231, NL 6525 EZ Nijmegen, The Netherlands. [email protected]
Ralf Koetter, University of Illinois at Urbana-Champaign, 115 Computing Systems Research Lab, 1308 W. Main, Urbana, IL 61801, USA. [email protected]
Song Li, Department of Physics, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. [email protected]
Peixun Luo, Department of Physics, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. [email protected]
Manfred Opper, The Neural Computing Research Group, School of Engineering and Applied Science, Aston University, Birmingham B4 7ET, UK. [email protected]
Fernando J. Pineda, Research and Technology Development Center, The Johns Hopkins University Applied Physics Laboratory, Johns Hopkins Rd., Laurel, MD 20723-6099, USA. [email protected]
Cheryl Resch, Research and Technology Development Center, The Johns Hopkins University Applied Physics Laboratory, Johns Hopkins Rd., Laurel, MD 20723-6099, USA. [email protected]
David Saad, The Neural Computing Research Group, School of Engineering and Applied Science, Aston University, Birmingham B4 7ET, UK. [email protected]
Hidetoshi Shimokawa, Faculty of Engineering, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan. [email protected]
D.M. Titterington, Department of Statistics, University of Glasgow, Glasgow G12 8QQ, Scotland, UK. [email protected]
Toshiyuki Tanaka, Department of Electronics and Information Engineering, Faculty of Engineering, Tokyo Metropolitan University, Circuits and Systems Engineering Laboratory, 1-1 Minami Oosawa, Hachioji, Tokyo 192-0397, Japan. [email protected]
Renato Vicente, The Neural Computing Research Group, School of Engineering and Applied Science, Aston University, Birmingham B4 7ET, UK. [email protected]
I-Jeng Wang, Research and Technology Development Center, The Johns Hopkins University Applied Physics Laboratory, Johns Hopkins Rd., Laurel, MD 20723-6099, USA. [email protected]
Yair Weiss, Computer Science Division, UC Berkeley, 485 Soda Hall, Berkeley, CA 94720-1776, USA. [email protected]
Wim Wiegerinck, Foundation for Neural Networks (SNN), Department of Medical Physics and Biophysics, University of Nijmegen, Geert Grooteplein 21, CPK1 231, NL 6525 EZ Nijmegen, The Netherlands. [email protected]
Ole Winther, Department of Theoretical Physics, Lund University, Sölvegatan 14A, S-223 62 Lund, Sweden. [email protected]
K. Y. Michael Wong, Department of Physics, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. [email protected]
Jonathan S. Yedidia, MERL - Mitsubishi Electric Research Laboratories, Inc., 201 Broadway, 8th Floor, Cambridge, MA 02139, USA. [email protected]
ACKNOWLEDGMENTS

We would like to thank Wei Lee Woon for helping us prepare the manuscript for publication, and the participants of the post-NIPS workshop on Advanced Mean Field Methods for their contribution to this book. Finally, we would like to thank Julianne and Christiane, Felix, Jonathan and Lior for their tolerance during this very busy summer.
1 Introduction
Manfred Opper and David Saad

A major problem in modern probabilistic modeling is the huge computational complexity involved in typical calculations with multivariate probability distributions when the number of random variables is large.

Take, for instance, probabilistic data models such as Bayesian belief networks, which have found widespread applications in artificial intelligence and neural computation. These models explain observed (visible) data by a set of hidden random variables using the joint distribution of both sets of variables. Statistical inference about the unknown hidden variables requires computing their posterior expectation given the observations. Model selection is often based on maximizing the marginal distribution of the observed data with respect to the model parameters. Since exact calculation of both quantities becomes infeasible when the number of hidden variables is large, and Monte Carlo sampling techniques may also reach their limits, there is growing interest in methods which allow for efficient approximations.

One of the simplest and most prominent approximations is based on the so-called Mean Field (MF) method, which has a long history in statistical physics. In this approach, the mutual influence between random variables is replaced by an effective field, which acts independently on each random variable. In its simplest version, this can be formulated as an approximation of the true distribution by a factorizable one. A variational optimization of such products results in a closed set of nonlinear equations for their expected values, which usually can be solved in a time that only grows polynomially in the number of variables.

Presently, there is increasing research activity aimed at developing improved approximations which take into account part of the neglected correlations between random variables, and at exploring novel fields of applications for such advanced mean field methods. Significant progress has been made by researchers coming from a variety of scientific backgrounds like statistical physics, computer science and mathematical statistics. These fields often differ in their scientific terminologies, intuitions and biases. For instance, physicists often prefer typically good approximations (with less clear worst case behavior) over the rigorous results favored by computer scientists. Since such 'cultural' differences may slow down the exchange of ideas, we organized the NIPS workshop on Advanced Mean Field Methods in 1999 to encourage further interactions and cross-fertilization between fields. The workshop revealed a variety of deep connections between the different approaches (like that of the Bethe approximation and belief propagation techniques) which has already led to the development of a novel algorithm. This book is a collection of the presentations given at the workshop together with a few other related invited papers.

The following problems and questions are among the central topics discussed in this book:

• Advanced MF approaches like the TAP (Thouless, Anderson, Palmer) method were originally derived for very specific models in statistical physics. How can we expand their theoretical foundations in order to make the methods widely applicable within the field of probabilistic data models?
• What are the precise relations between the statistical physics approaches and other methods which have been developed in the computer science community, like the belief propagation technique? Can we use this knowledge to develop novel and even more powerful inference techniques by unifying and combining these approaches?
• The quality of the MF approximation is, in general, unknown. Can we predict when a specific MF approximation will work better than another? Are there systematic ways to improve these approximations such that our confidence in the results will increase?
• What are the promising application areas for advanced mean field approaches, and what are the principled ways of solving the mean field equations when the structure of the dependencies between random variables is sufficiently complicated?

The chapters of this book can be grouped into two parts. While chapters 2-9 focus mainly on approaches developed in the statistical physics community, chapters 10-17 are more biased towards ideas originated in the computer science/statistics communities.

Chapters 2 and 3 can serve as introductions to the main ideas behind the statistical physics approaches. Chapter 2 explains three different types of MF approximations, demonstrated on a simple Boltzmann machine like Ising model. Naive mean field equations are derived by the variational method and by a field theoretic approach. In the latter, high dimensional sums are transformed into integrals over auxiliary variables which are approximated by Laplace's method. The TAP MF equations account for correlations between random variables by an approximate computation of the reaction of all variables to the deletion of a single variable from the system.

Chapter 3 explains the role of the statistical physics free energy within the framework of probabilistic models and shows how different approximations of the free energy lead to various advanced MF methods. In this way, the naive MF theory, the TAP approach and the Bethe approximation are derived as the first terms in two different systematic expansions of the free energy, the Plefka expansion and the cluster variation method of Kikuchi. Remarkably, the minima of the Bethe free energy are identified as the fixed points of the celebrated belief propagation algorithm for inference in graphical models. This connection opens new ways for systematic improvements of this algorithm.

The following five chapters present various generalizations and applications of TAP-like mean field approaches. A novel derivation of the TAP approximation is presented in chapter 4. It is based on a truncation of a power series expansion of marginal distributions with respect to the couplings between random variables. This derivation opens up new fields of applications for the TAP approach, such as graphical models with general types of interactions. It also allows one to treat stochastic networks with asymmetric couplings for which a closed form of the stationary probability distribution is not available. Numerical simulations for graphical models and comparisons with simple MF theory demonstrate the significance of the TAP method.

Chapter 5 addresses the problem of deriving the correct form of the TAP equations for models where the interactions between random variables have significant correlations. It also bridges the gap between the TAP approach and the belief propagation method. Demonstrating the method on the Hopfield model, the original set of random variables is augmented by an auxiliary set such that the mutual dependencies are weak enough to justify a tree approximation. The equations derived for the corresponding set of conditional probabilities on this tree reduce to the well known TAP equations for the Hopfield model in the limit of extensive connectivity.

Chapter 6 employs the framework presented in chapter 5 to investigate decoding techniques within the context of low-density parity-check error-correcting codes. It shows the similarity between the decoding dynamics obtained using the TAP approach and the method of belief propagation. Numerical experiments examine the efficacy of the method as a decoding algorithm by comparing the results obtained with the analytical solutions.

Chapter 7 introduces a method for adapting the TAP approach to a concrete set of data, providing another answer to the problem raised in chapter 5. The method avoids the assumptions, usually made in the cavity derivation of the TAP equations, about the distribution of interactions between random variables. By using the cavity method together with linear response arguments, an extra set of data dependent equations for the reaction terms is obtained. Applications of the adaptive TAP approximation to the Hopfield model as well as to Bayesian classification are presented.

Chapter 8 presents a TAP-like mean field theory to treat stochastic dynamical equations. The cavity method is used to derive dynamical mean field equations for computing the temporal development of averages. The method is applied to the average case performance of stochastic learning algorithms for neural networks. It is shown how static averages over the steady state distribution are obtained in the infinite time limit. The chapter sheds more light on the meaning of, and the basic assumptions behind, the cavity approach by showing how the formalism must be altered in the case of a rugged energy landscape.

Chapter 9 applies the field theoretic mean field approach to computing the marginal probability of the visible variables in graphical models. In this method, the relevant random variables are decoupled using auxiliary integration variables. The summations over the huge number of values of the random variables can now be performed exactly. The remaining integrals are performed by a quadratic expansion around the saddle point. As shown for two examples of Bayesian belief networks, this approximation can yield a dramatic improvement over the results achieved by applying the variational method using a factorized distribution.

Chapter 10 presents a general introduction to the variational method and its application to inference in probabilistic models. By reformulating inference tasks as optimization problems, tractable approximations can be obtained by suitable restriction of the solution space. The standard MF method is generalized by minimizing the Kullback-Leibler divergence using factorized variational distributions, where each factor contains a tractable substructure of variables. A different way of decoupling random variables is achieved by using variational transformations for conditional probabilities. The chapter discusses various fields of applications of these ideas.

Chapters 11 and 12 discuss applications and modifications of the variational method for complex probabilistic models with hidden states. Chapter 11 shows how a factorial approximation to the distribution of hidden states can be used to obtain a tractable approximation for the E-step of the EM algorithm for parameter estimation. This idea can be generalized to model estimation in a Bayesian framework, where a factorization of the joint posterior of parameters and hidden variables enables an approximate optimization of the Bayesian evidence. The occurring variational problems can be solved efficiently by a Bayesian generalization of the EM algorithm for exponential models and their conjugate priors. The method is demonstrated on mixtures of factor analyzers and state-space models.

Chapter 12 reconsiders the Bayesian inference problem with hidden variables discussed in the previous chapter. It offers alternative approaches for approximate factorizations of posterior distributions in cases when the standard variational method becomes computationally infeasible. In these recursive procedures, a factorized approximation to the posterior is updated any time a new observation arrives. A recursive variational optimization is compared with the probabilistic editor, which recursively matches moments of marginal posterior distributions. The Quasi-Bayes method replaces hidden variables at any update of the posterior by their approximate posterior expectations based on the already collected data. The probabilistic editor outperforms the other two strategies in simulations of a toy neural network and a simple hidden Markov model.

Chapter 13 gives an introduction to belief propagation (BP) for directed and undirected graphical models. BP is an inference technique which is exact for graphs with a tree structure. However, the method may become intractable in densely connected directed graphs. To cope with the computational complexity, an integral transformation of the intractable sums together with a saddle-point approximation, similar to the field theoretic MF approach discussed in chapter 9, is introduced. Simulations for a graphical model which allows representations by both directed and undirected graphs show that the method outperforms a simple variational MF approximation and undirected BP.

Chapters 14 and 15 investigate the performance of BP inference algorithms when applied to probabilistic models with loopy graphs. In such a case, exact inference can no longer be guaranteed. Chapter 14 introduces a modification of the max-product algorithm designed to compute the maximum posterior probability (MAP). By properly attenuating the BP messages, the algorithm can properly deal with the dependencies introduced by the cycles in the graph. It is shown rigorously for codes on graphs that in this way the exact global MAP configuration of the random variables is reached, if the algorithm converges. The question of when such an algorithm converges remains open.

Chapter 15 also demonstrates the importance of understanding the actual dynamics of advanced MF inference algorithms. It compares the performance of BP to the simple MF method on Markov random field problems. The fixed points of both algorithms coincide with zero gradient solutions of different approximate free energies (see also chapter 3). For a variety of numerical examples BP outperforms the simple MF method. Remarkably, one finds that BP often converges to a configuration which is close to the global minimum of the simple MF free energy, whereas the simple MF algorithm performs worse by getting trapped in local minima.

Chapters 16 and 17 conclude the book by discussing mean field approaches from the viewpoint of the information geometric approach to statistical inference. Understanding the invariant geometric properties of MF approximations may help to identify new ways of assessing and improving their accuracy.

Chapter 16 introduces a one parameter family of non-symmetric distance measures between probability distributions which are demonstrated for the exponential family of Boltzmann machines. An expansion of these α-divergences for neighboring distributions involves the Fisher information, which gives the manifold of distributions a unique invariant metric. Orthogonal projections of a multivariate distribution onto the manifold of factorized distributions interpolate between the desired intractable exact marginal distribution (α = -1), for which there is a unique solution, and the naive MF approximation (α = 1), for which many solutions often exist. This framework suggests a novel approximation scheme based on an expansion of the intractable projections in powers of α around the tractable point α = 1. An alternative way to approximate the intractable α-projections, based on a power series expansion in the coupling matrix, leads to a novel derivation of the TAP equations and their generalization to arbitrary α.

In chapter 17, the ideas of information geometry are shown to provide a unified treatment of different mean field methods and shed light on the theoretical basis of the variational approach. Variational derivation of the naive MF method may be understood as a projection of the true distribution onto the manifold of factorized distributions. A family of manifolds is introduced which is controlled by a single parameter that interpolates between the fully factorized distributions and the manifold of general distributions, which includes the intractable true distribution. The desired full variation can be approached perturbatively by an expansion with respect to this parameter. In this way, a new interpretation of the Plefka expansion for the TAP equations emerges. The geometric approach is extended to the variational Bayes method and to the variational approximation to the EM algorithm, which is understood as the alternation of two projection types.

This book is aimed at providing a fairly comprehensive overview of recent developments in the area of advanced mean field theories, examining their theoretical background, links to other approaches and possible novel applications. The chapters were designed to contain sufficiently detailed material to enable the non-specialist reader to follow the main ideas with minimal background reading.
2 From Naive Mean Field Theory to the TAP Equations
Manfred Opper and Ole Winther

We give a basic introduction to three different MF approaches which will be discussed on a more advanced level in other chapters of this book. We discuss the Variational, the Field Theoretic and the TAP approaches and their applications to a Boltzmann machine type of Ising model.

1 Introduction

Mean field (MF) methods provide tractable approximations for the computation of high dimensional sums and integrals in probabilistic models. By neglecting certain dependencies between random variables, a closed set of equations for the expected values of these variables is derived, which often can be solved in a time that only grows polynomially in the number of variables.

The method has its origin in Statistical Physics, where the thermal fluctuations of particles are governed by high dimensional probability distributions. In the field of probabilistic modeling, the MF approximation is often identified as a special kind of variational approach in which the true intractable distribution is approximated by an optimal factorized one. On the other hand, a variety of other approximations with a "mean field" flavor are known in the Statistical Physics community. Compared to the variational approach, however, the derivation of these other techniques seems less "clean". For instance, the "field theoretic" MF approaches may lack a clear-cut probabilistic interpretation because of the occurrence of auxiliary variables, integrated in the complex plane. Hence, one is often unable to turn such a method into an exact bound. Nevertheless, as the different contributions to this book show, the power of non-variational MF techniques should not be ignored.

This chapter does not aim at presenting any new results but rather tries to give a basic and brief introduction to three different MF approaches which will be discussed on a more advanced level in other chapters of this book. These are the Variational, the Field Theoretic and the TAP approaches. Throughout the chapter, we will explain the application of these methods to the case of an Ising model (also known as a Boltzmann machine in the field of Neural Computation). Our review of MF techniques is far from exhaustive, and we expect that other methods may play an important role in the future. Readers who want to learn more about Statistical Physics techniques and the MF method may consult existing textbooks, e.g. [16; 19; 33]. A more thorough explanation of the variational method and its applications will be given in the chapters [5; 7; 9] of this book. A somewhat complementary review of advanced MF techniques is presented in the next chapter [32].
2 The Variational Mean Field Method

Perhaps the best known derivation of mean field equations outside the Statistical Physics community is the one given by the Variational Method. This method approximates an intractable distribution $P(S)$ of a vector $S = (S_1, \ldots, S_N)$ of random variables by $Q(S)$, which belongs to a family $\mathcal{M}$ of tractable distributions. The distribution $Q$ is chosen such that it minimizes a certain distance measure $D(Q, P)$ within the family $\mathcal{M}$. To enable tractable computations, $D(Q, P)$ is chosen as the relative entropy, or Kullback-Leibler divergence,

$$\mathrm{KL}(Q\|P) = \sum_S Q(S) \ln \frac{Q(S)}{P(S)} = \Big\langle \ln \frac{Q}{P} \Big\rangle_Q, \qquad (1)$$

where the bracket $\langle \cdots \rangle_Q$ denotes an expectation with respect to $Q$. Since $\mathrm{KL}(Q\|P)$ is not symmetric in $P$ and $Q$, one might wonder if $\mathrm{KL}(P\|Q)$ would be a better choice (this question is discussed in the two chapters of [28; 1]). The main reason for choosing (1) is the fact that it requires only computations of expectations with respect to the tractable distribution $Q$ instead of the intractable $P$.

We will specialize to the class of distributions $P$ that are given by

$$P(S) = \frac{e^{-H[S]}}{Z}, \qquad (2)$$

where $S = (S_1, \ldots, S_N)$ is a vector of binary (spin) variables $S_i \in \{-1, +1\}$ and

$$H[S] = -\sum_{i<j} S_i J_{ij} S_j - \sum_i S_i \theta_i. \qquad (3)$$

Finally, the normalizing partition function is

$$Z = \sum_S e^{-H[S]}. \qquad (4)$$

We are interested both in approximations to expectations like $\langle S_i \rangle$ and in approximations to the value of the free energy $-\ln Z$. Inserting $P$ into (1), we get

$$\mathrm{KL}(Q\|P) = \ln Z + E[Q] - S[Q], \qquad (5)$$

where

$$S[Q] = -\sum_S Q(S) \ln Q(S) \qquad (6)$$

is the entropy of the distribution $Q$ (not to be confused with the random variable $S$) and

$$E[Q] = \sum_S Q(S) H[S] \qquad (7)$$

is called the variational energy.
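For small systems the model (2)-(4) can be solved exactly by enumeration, which is a useful reference point for all the approximations that follow. The Python sketch below is our own illustration, not from the chapter; the function names and the randomly drawn couplings `J` and fields `theta` are assumptions chosen for the demo (later snippets reuse `J`, `theta`, `free_energy` and `m_exact`).

```python
import itertools
import numpy as np

def exact_stats(J, theta):
    """Enumerate all 2^N spin configurations of the model (2)-(4) and
    return the exact free energy -ln Z and the marginal means <S_i>."""
    N = len(theta)
    Z, mean = 0.0, np.zeros(N)
    for s in itertools.product([-1.0, 1.0], repeat=N):
        s = np.asarray(s)
        # H[S] = -sum_{i<j} S_i J_ij S_j - sum_i theta_i S_i
        # (J symmetric with zero diagonal, so the pair sum is 0.5 * s J s)
        energy = -0.5 * s @ J @ s - theta @ s
        w = np.exp(-energy)
        Z += w
        mean += w * s
    return -np.log(Z), mean / Z

rng = np.random.default_rng(0)
N = 10
J = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))
J = (J + J.T) / 2.0          # symmetric couplings
np.fill_diagonal(J, 0.0)     # no self-couplings
theta = rng.normal(0.0, 0.1, N)
free_energy, m_exact = exact_stats(J, theta)
```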
The mean field approximation is obtained by taking the approximating family $\mathcal{M}$ to be all product distributions, i.e.

$$Q(S) = \prod_j Q_j(S_j). \qquad (8)$$

For $S_i \in \{-1, +1\}$, the most general form of the $Q_j$'s is obviously

$$Q_j(S_j; m_j) = \frac{1 + S_j m_j}{2}, \qquad (9)$$

where the $m_j$'s are variational parameters which are identified as the expectations $m_j = \langle S_j \rangle_Q$. Using the statistical independence of the $S_j$'s with respect to $Q$, the variational entropy is found to be

$$S[Q] = -\sum_i \left\{ \frac{1+m_i}{2} \ln \frac{1+m_i}{2} + \frac{1-m_i}{2} \ln \frac{1-m_i}{2} \right\}, \qquad (10)$$

and the variational energy reduces to

$$E[Q] = \langle H[S] \rangle_Q = -\sum_{i<j} J_{ij} m_i m_j - \sum_i m_i \theta_i. \qquad (11)$$

Although the partition function $Z$ cannot be computed efficiently, it will not be needed because it does not depend on $Q$. Hence, all we have to do is minimize the variational free energy

$$F[Q] = E[Q] - S[Q]. \qquad (12)$$

Differentiating (12) with respect to the $m_i$'s gives the set of $N$ mean field equations

$$m_i = \tanh\Big(\sum_j J_{ij} m_j + \theta_i\Big), \qquad i = 1, \ldots, N. \qquad (13)$$

The intractable task of computing exact averages over $P$ has been replaced by the problem of solving the set (13) of nonlinear equations, which can often be done in a time that grows only polynomially with $N$. Note that there might be many solutions to (13), and some of them may not even be local minima of (12) but rather saddles. Hence, solutions must be compared by their value of the variational free energy $F[Q]$.

As an extra bonus of the variational MF approximation we get an upper bound on the exact free energy $-\ln Z$. Since $\mathrm{KL}(Q\|P) \geq 0$, we have from (5)

$$-\ln Z \leq E[Q] - S[Q] = F[Q]. \qquad (14)$$
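To make (12)-(14) concrete, here is a minimal damped fixed-point iteration of (13) together with an evaluation of the variational free energy. This sketch is ours; the function names, the damping heuristic and the convergence tolerances are assumptions of the illustration, and it reuses `J`, `theta` and `free_energy` from the enumeration snippet above.

```python
def naive_mf(J, theta, damping=0.5, tol=1e-10, max_iter=10_000):
    """Damped fixed-point iteration of the mean field equations (13)."""
    m = np.zeros(len(theta))
    for _ in range(max_iter):
        m_new = np.tanh(J @ m + theta)
        if np.max(np.abs(m_new - m)) < tol:
            break
        m = damping * m + (1.0 - damping) * m_new
    return m

def variational_free_energy(m, J, theta):
    """F[Q] = E[Q] - S[Q] from (10)-(12), an upper bound on -ln Z by (14)."""
    p = np.clip((1.0 + m) / 2.0, 1e-12, 1.0)
    q = np.clip((1.0 - m) / 2.0, 1e-12, 1.0)
    entropy = -np.sum(p * np.log(p) + q * np.log(q))
    energy = -0.5 * m @ J @ m - theta @ m
    return energy - entropy

m_mf = naive_mf(J, theta)
# The bound (14): F[Q] must not fall below the exact -ln Z.
assert variational_free_energy(m_mf, J, theta) >= free_energy - 1e-9
```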
Obviously, the mean field approximation takes into account the couplings $J_{ij}$ between the random variables but neglects statistical correlations, in the sense that $\langle S_i S_j \rangle_Q = \langle S_i \rangle_Q \langle S_j \rangle_Q$. To get some more intuition about the effect of this approximation, we can compare the mean field equations (13) for $m_i = \langle S_i \rangle_Q$ with a set of exact equations which hold for the true distribution $P$ of (2). It is not hard to prove the so-called Callen equations (see e.g. chapter 3 of [19])

$$\langle S_i \rangle = \Big\langle \tanh\Big(\sum_j J_{ij} S_j + \theta_i\Big) \Big\rangle, \qquad i = 1, \ldots, N. \qquad (15)$$

Unfortunately, both sides of (15) are formulated in terms of expectations (we have omitted the subscript) with respect to the difficult $P$. While in (15) the expectation is outside the nonlinear tanh function, the approximation (13) has the expectation inside the tanh. Hence, the MF approximation replaces the fluctuating "field" $h_i = \sum_j J_{ij} S_j$ by (an approximation to) its mean field. Estimating the variance of $h_i$ may therefore give us an idea of how good the approximation is. We will come back to this question later.

3 The Linear Response Correction

Although the product distribution $Q(S)$ neglects correlations between the random variables, there is a simple way of computing a non-vanishing approximation to the covariances $\langle S_i S_j \rangle - \langle S_i \rangle \langle S_j \rangle$ based on the MF approach. By differentiating

$$\langle S_i \rangle = Z^{-1} \sum_S S_i \, e^{-H[S]} \qquad (16)$$

with respect to $\theta_j$, we obtain the linear response relation

$$\frac{\partial \langle S_i \rangle}{\partial \theta_j} = \langle S_i S_j \rangle - \langle S_i \rangle \langle S_j \rangle. \qquad (17)$$

Equation (17) holds only for expectations with respect to the true $P$, but not for the approximating $Q$. Hoping that the MF method gives us a reasonable approximation to $\langle S_i \rangle$, we can compute the MF approximation to the left hand side of (17) and obtain a nontrivial approximation to the right hand side. This approximation has been applied to Boltzmann machine learning [11] and independent component analysis [8].
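The chapter does not spell out the resulting formula, but differentiating the MF equations (13) with respect to $\theta_j$ gives a closed-form estimate of the left hand side of (17): $\chi_{ij} = (1 - m_i^2)\big(\delta_{ij} + \sum_k J_{ik}\chi_{kj}\big)$, i.e. $\chi = (D^{-1} - J)^{-1}$ with $D = \mathrm{diag}(1 - m_i^2)$. The sketch below implements this; the derivation and naming are ours, so treat it as an illustration under those assumptions (it reuses `m_mf` and `J` from above).

```python
def linear_response_cov(m, J):
    """MF estimate of the covariances <S_i S_j> - <S_i><S_j> via (17):
    differentiating (13) w.r.t. theta_j gives
        chi = (D^{-1} - J)^{-1},  D = diag(1 - m_i^2)."""
    D_inv = np.diag(1.0 / (1.0 - m**2))
    return np.linalg.inv(D_inv - J)

C_lr = linear_response_cov(m_mf, J)  # off-diagonal entries are nonzero,
                                     # unlike under the factorized Q itself
```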
4 The Field Theoretic Approach

Another way of obtaining a mean field theory is motivated by the idea that we often have better approximation techniques for performing integrals than for calculating discrete sums. If we can replace the expectations over the random variables $S_i$ by integrations over auxiliary "field variables", we can approximate the integrals using the Laplace or saddle-point methods.

As an example, we consider a simple Gaussian transformation of (2). To avoid complex representations, we assume that the matrix $J$ is positive definite, so that we can write

$$\exp\Big[\frac{1}{2} \sum_{ij} S_i J_{ij} S_j\Big] = \frac{1}{(2\pi)^{N/2} \sqrt{\det J}} \int \prod_i dx_i \; e^{-\frac{1}{2} \sum_{ij} x_i (J^{-1})_{ij} x_j + \sum_i x_i S_i}. \qquad (18)$$

This transformation is most easily applied to the partition function $Z$ of (4), yielding

$$Z \propto \int \prod_i dx_i \; e^{-\frac{1}{2} \sum_{ij} x_i (J^{-1})_{ij} x_j} \prod_i \sum_{S_i = \pm 1} e^{S_i (x_i + \theta_i)}, \qquad (19)$$

where we have omitted some constants. In this representation, the sums over binary variables factorize and can be carried out immediately, with the result

$$Z \propto \int \prod_i dx_i \; e^{\Phi(x)}, \qquad (20)$$

where

$$\Phi(x) = -\frac{1}{2} \sum_{ij} x_i (J^{-1})_{ij} x_j + \sum_i \ln 2\cosh(x_i + \theta_i). \qquad (21)$$

Hence, we have transformed a high-dimensional sum into a high-dimensional non-Gaussian integral. Hoping that the major contribution to the integral comes from values of the function $\Phi$ close to its maximum, we replace the integral (20) by

$$Z \approx e^{\Phi(x^0)}, \qquad (22)$$

where $x^0 = \arg\max_x \Phi(x)$. This is termed the Laplace approximation. Setting the gradient $\nabla_x \Phi(x)$ equal to zero, we get the set of equations

$$\sum_j (J^{-1})_{ij} x_j^0 = \tanh(x_i^0 + \theta_i). \qquad (23)$$

A comparison of (23) with (13) shows that by identifying the auxiliary variables $x_i^0$ with the mean fields via

$$x_i^0 \equiv \sum_j J_{ij} m_j, \qquad (24)$$

we recover the same mean field equations as before. This is easily understood from the fact that we have replaced the integration variables $x_i$ by constant values. This leaves us with a partition function for the same type of factorizing distribution

$$Q(S) \propto \prod_j e^{S_j (x_j^0 + \theta_j)} \qquad (25)$$

(written in a slightly different form) that we used in the variational approach. Hence, it seems we have not gained anything new. One might even argue that we have lost something in this derivation, namely the bound on the free energy $-\ln Z$. It is not clear how this could be proved easily within the Laplace approximation.
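The claimed equivalence of (23) and (13) is easy to verify numerically: if $m$ solves (13), then $x^0 = Jm$ should satisfy the saddle-point conditions (23). A small check of ours (it only needs `J` to be invertible, whereas the transformation (18) additionally requires positive definiteness):

```python
# If m solves (13), then x0 = J m satisfies (23), since
# J^{-1} x0 = m = tanh(J m + theta) = tanh(x0 + theta).
x0 = J @ m_mf
assert np.allclose(np.linalg.solve(J, x0), np.tanh(x0 + theta), atol=1e-8)
```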
However, we would like to argue that when the interactions between random variables are more complicated than in the simple quadratic model (3), the field-theoretic approach decouples the original sums in a very simple and elegant way for which there may not be an equivalent expression in the variational method. This can often be achieved by using a Dirac $\delta$-function representation, which is given by

$$1 = \int dh \, \delta(h - x) = \int \frac{dh \, d\hat{h}}{2\pi} \, e^{i\hat{h}(h - x)}, \qquad (26)$$

where the $i = \sqrt{-1}$ in the exponent should not be confused with a variable index. The transformation can be applied to partition functions of the type

$$Z = \sum_S \prod_j f\Big(\sum_k J_{jk} S_k\Big) \qquad (27)$$

$$= \int \prod_j \Big(\frac{dh_j \, d\hat{h}_j}{2\pi}\Big) \, e^{-i \sum_j \hat{h}_j h_j} \prod_j f(h_j) \prod_k \Big\{ \sum_{S_k} e^{i S_k \sum_j J_{jk} \hat{h}_j} \Big\}. \qquad (28)$$

Since the functions in (28) are no longer positive (in fact, not even real), the search for a maximum of $\Phi$ must be replaced by the saddle-point method, where (after a deformation of the path of integration in the complex plane) one looks for values of $h$ and $\hat{h}$ for which the corresponding exponent is stationary.

In general, the field theoretic MF approach does not have an equivalent variational formulation (in fact, depending on the way the auxiliary fields are chosen, we may get different MF formulations). Hence, it is unclear if the approximation to $Z$ will lead to a bound on the free energy. While there is no general answer so far, an example given in one of the chapters of this book [22] indicates that in some cases this may still be true.

A further important feature of the saddle-point approximation is the fact that it can be systematically improved by expanding $\Phi$ around the stationary value. The inclusion of the quadratic terms may already give a dramatic improvement. Applications of these ideas to graphical models can be found in this book [22; 2].
5 When Does MFT Become Exact?

We have seen from the Callen equation (15) that the simple MF approximation neglects the fluctuations of the fields

$$h_i = \sum_j J_{ij} S_j, \qquad (29)$$

which are sums of random variables. In the interesting case where $N$, the total number of variables $S_j$, is large, one might hope that the fluctuations could be small, assuming that the $S_j$ are weakly dependent. We will compute crude estimates of these fluctuations for two extreme cases.

• Case I: All couplings $J_{ij}$ are positive and equal. In order to keep the fields $h_i$ of order $O(1)$ when $N$ grows large, we set $J_{ij} = J_0/N$. This model is known as the mean field ferromagnet in Statistical Physics. If we make the crude approximation that all variables $S_j$ are independent, the variances $\mathrm{Var}(J_{ij} S_j) = J_0^2 (1 - \langle S_j \rangle^2)/N^2$ of the individual terms in (29) simply add up to a total variance of the fields $\mathrm{Var}(h_i) = O(1/N)$ for $N \to \infty$. Hence, in this case the MF approximation becomes exact. A more rigorous justification of this result can be obtained within the field theoretic framework of the previous section. The necessary Gaussian transformation for this case is simpler than (18) and reads

$$\exp\Big[\frac{J_0}{2N} \Big(\sum_i S_i\Big)^2\Big] = \sqrt{\frac{N}{2\pi J_0}} \int dx \, \exp\Big[-\frac{N x^2}{2 J_0} + x \sum_i S_i\Big]. \qquad (30)$$

Inserting (30) into the partition function (4) shows that Laplace's method for performing the single integral over $x$ is justified for $N \to \infty$ by the occurrence of the factor $N$ in the exponent.

In practical applications of MF methods, the couplings $J_{ij}$ are usually related to some observed data and will not be constant, but may rather show a strong variability. Hence, it is interesting to study

• Case II: The $J_{ij}$'s are assumed to be independent random variables (for $i < j$) with zero mean. Setting $\theta_i = 0$ for simplicity, we are now adding up $N$ terms in (29) which have roughly equal fractions of positive and negative signs. To keep the $h_i$'s of order 1, the magnitude of the $J_{ij}$'s should then scale like $1/\sqrt{N}$. With the same arguments as before, neglecting the dependencies of the $S_j$'s, we find that the variance of $h_i$ is now $O(1)$ for $N \to \infty$, and the simple MF approximation fails to become exact.

As will be shown in the next section, the failure of the "naive" mean field theory (13) in Case II can be cured by adding a suitable correction. This leads us to the TAP mean field theory, which is still a closed set of equations for the expectations $\langle S_i \rangle$. Under some conditions on the variance of the $J_{ij}$'s, it is believed that these mean field equations are exact for Case II in the limit $N \to \infty$, with probability 1 with respect to a random drawing of the $J_{ij}$'s.

In fact, it should be possible to construct an exact mean field theory for any model where the $J_{ij}$'s are of "infinite range". The phrase infinite range is best understood if we assume for a moment that the spins $S_i$ are located at sites $i$ on a finite dimensional lattice. If the $J_{ij}$'s do not decay to zero when the distance $\|i - j\|$ is large, we speak of an infinite range model. In such cases, the "neighbors" $S_j$ of $S_i$ which contribute dominantly to the field $h_i$ of (29) are not clustered in a small neighborhood of site $i$ but are rather distributed all over the system. In such a case, we can expect that dependencies are weak enough to be treated well in a mean field approximation. Especially when the connections $J_{ij}$ between two arbitrary spins $S_i$ and $S_j$ are completely random (this includes sparse as well as extensive connectivities), the model is trivially of infinite range.
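The two scaling claims are easy to check numerically. The Monte Carlo sketch below is ours and uses the same crude independence assumption as the text (spins drawn as independent fair coin flips; self-couplings are not excluded, which only shifts the result at order $1/N$). The Case I column shrinks like $1/N$ while the Case II column stays of order one.

```python
def field_variance(J, n_samples=4_000, seed=1):
    """Estimate Var(h_i), h_i = sum_j J_ij S_j, under independent uniform
    spins S_j = +/-1; returns the variance averaged over sites i."""
    rng = np.random.default_rng(seed)
    S = rng.choice([-1.0, 1.0], size=(n_samples, J.shape[0]))
    return (S @ J.T).var(axis=0).mean()

for n in (100, 400, 1600):
    J_ferro = np.full((n, n), 1.0 / n)  # Case I: J_ij = J0/N with J0 = 1
    J_rand = np.random.default_rng(2).choice(
        [-1.0, 1.0], size=(n, n)) / np.sqrt(n)  # Case II: random +/- 1/sqrt(N)
    print(n, field_variance(J_ferro), field_variance(J_rand))  # ~1/N vs ~1
```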
6 TAP equations I: The cavity approach

The TAP mean field equations are named after D.J. Thouless, P.W. Anderson and R.G. Palmer [29], who derived a MF theory for the Sherrington-Kirkpatrick (SK) model [26]. The SK model is of the type (3), where the couplings J_ij are independent Gaussian random variables for i < j with variance J_0/N. For simplicity, we set the mean equal to zero. We will give two derivations in this chapter. A further derivation and generalizations are presented in another chapter of this book [10]. Perhaps the most intuitive one is the cavity method introduced by Parisi and Mezard [16]. It is closely related to the Bethe approximation [3], which is an exact mean field theory on a tree.

Our goal is to derive an approximation for the marginal distribution P_i(S_i) for each spin variable. We begin with the exact representation

P_i(S_i) = \sum_{S \backslash S_i} P(S) \propto \sum_{S \backslash S_i} e^{S_i (\sum_j J_{ij} S_j + \theta_i)}\, P(S \backslash S_i) .   (31)

P(S \backslash S_i) equals the joint distribution of the N - 1 spins S \backslash S_i for an auxiliary system where S_i has been removed (by setting the J_ij's equal to zero for all j \neq i). If the graph of nonzero J_ij's were a tree, i.e., if it contained no loops, the S_j's would be fully independent after being disconnected from S_i. In this case, the joint distribution P(S \backslash S_i) would factorize into a product of individual marginals P_{j \backslash i}(S_j). From this, one would obtain immediately the marginal distribution as

P_i(S_i) \propto e^{\theta_i S_i} \prod_{j \neq i} \sum_{S_j} e^{S_i J_{ij} S_j}\, P_{j \backslash i}(S_j) .   (32)

Within the tree assumption one could proceed further (in order to close the system of equations) by applying the same procedure to each of the auxiliary marginals P_{j \backslash i}(S_j) and expressing them in terms of their neighbors (excluding S_i). This would lead us directly to the Belief Propagation (BP) algorithm [21] for recursively computing a set of "messages" defined by

m_{j \backslash i}(S_i) = \sum_{S_j} e^{S_i J_{ij} S_j}\, P_{j \backslash i}(S_j) .   (33)

This approach, as well as its applications, will be presented in more detail in other chapters [4; 30; 25; 32]. The route from the BP method to the TAP equations is presented in [13]. We will follow a different route which leads to considerable simplifications by utilizing the fact that the SK model is fully connected. Going back to the formulation (7), we see that the only dependence between S_i and the other variables S_j is through the field h_i = \sum_j J_{ij} S_j. Hence, it is possible to rewrite the marginal distribution (32) in terms of the joint distribution P(S_i, h_i) of S_i and h_i,

P_i(S_i) = \int dh_i\, P(S_i, h_i) ,   (34)
where we have introduced the "cavity"¹ distribution of h_i as

P(h_i \backslash i) = \sum_{S \backslash S_i} \delta\Big(h_i - \sum_j J_{ij} S_j\Big)\, P(S \backslash S_i) .   (35)

We get

P(S_i, h_i) \propto e^{S_i (h_i + \theta_i)}\, P(h_i \backslash i) .   (36)

For the SK model the independence used in (32) does not hold, but one may argue that it can be safely replaced in the following by sufficiently weak correlations. In the limit N \to \infty, we assume that this is enough to invoke a central limit theorem for the field h_i and replace (35) by the simple Gaussian distribution²

P(h_i \backslash i) \approx \frac{1}{\sqrt{2\pi V_i}} \exp\Big( -\frac{(h_i - \langle h_i \rangle_{\backslash i})^2}{2 V_i} \Big)   (37)

in the computation of (36). We have denoted an average over the cavity distribution by \langle \cdot \rangle_{\backslash i}. Using (6) within (36), we get immediately

\langle S_i \rangle = \tanh\big( \theta_i + \langle h_i \rangle_{\backslash i} \big) ,  i = 1, \ldots, N ,   (38)

as the first part of the TAP equations. (38) should be compared to the corresponding set of "naive" MF equations (13), which can be written as

\langle S_i \rangle = \tanh\Big( \theta_i + \sum_j J_{ij} \langle S_j \rangle \Big) ,  i = 1, \ldots, N .   (39)

In order to close the system of equations, we have to express the cavity expectations \langle h_i \rangle_{\backslash i} and the variances V_i in terms of the full expectations

\langle h_i \rangle = \sum_j J_{ij} \langle S_j \rangle .   (40)

Within the Gaussian approximation (37) we get

\langle h_i \rangle = \sum_{S_i} \int dh_i\, P(S_i, h_i)\, h_i = \langle h_i \rangle_{\backslash i} + V_i \langle S_i \rangle .   (41)

Hence, only the variances V_i of the cavity field remain to be computed. By definition, they are

V_i = \sum_{j,k} J_{ij} J_{ik} \big( \langle S_j S_k \rangle_{\backslash i} - \langle S_j \rangle_{\backslash i} \langle S_k \rangle_{\backslash i} \big) .   (42)

Since the J_ij's are modeled as independent random variables, we argue that the fluctuations of the V_i's with respect to the random sampling of the couplings can

¹ The name is derived from the physical context, where h_i is the magnetic field at the cavity which is left when spin i is removed from the system.
² The cavity method for a model with finite connectivity is discussed in [15].
be neglected for N \to \infty, and we can safely replace V_i by

V_i = \sum_j \overline{J_{ij}^2}\, \big( 1 - \langle S_j \rangle_{\backslash i}^2 \big) \approx \frac{J_0}{N} \sum_j \big( 1 - \langle S_j \rangle^2 \big) ,   (43)

where the bar denotes an average over the distribution of the J_ij's. Note that, by the independence of the couplings, the averages over the J_ij's and the terms \langle S_j \rangle_{\backslash i} factorize. To get the last expression in (43) we have assumed that both the fluctuations and the effect of removing S_i can be neglected in the sum. From equations (38), (41) and (43) we get the TAP equations for the SK model

\langle S_i \rangle = \tanh\Big( \theta_i + \sum_j J_{ij} \langle S_j \rangle - J_0 (1 - q) \langle S_i \rangle \Big) ,   (44)

where q = \frac{1}{N} \sum_j \langle S_j \rangle^2. Equations (44) differ from the simple or "naive" MF equations (13) by the correction -J_0 (1 - q) \langle S_i \rangle, which is usually called the Onsager Reaction Term. Although the simple MF approximation and the TAP approach are both based on weak correlations between random variables, the TAP approach makes this assumption only when computing the distribution of the cavity field h_i, i.e., for the case when S_i is disconnected from the system. The Onsager term is the difference between \langle h_i \rangle and the cavity expectation \langle h_i \rangle_{\backslash i} (compare (38) and (39)) and takes into account the reaction of the neighbors S_j due to the correlations created by the presence of S_i.

A full discussion about why and when (44) yields an exact mean field theory for the SK model is subtle and goes beyond the scope of this chapter. Interested readers are referred to [16]. We can only briefly touch the problems. The main property in deriving the TAP equations is the assumption of weak correlations, expressed as

\langle S_j S_k \rangle - \langle S_j \rangle \langle S_k \rangle \to 0 \quad \text{for } N \to \infty \ (j \neq k) ,   (45)

which can be shown to hold for the SK model when the size of the couplings J_0 is sufficiently small. In this case, there is only a single solution to (44). Things become more complicated with increasing J_0. Analytical calculations show that one enters a complex free energy landscape, i.e., a (spin glass) phase of the model where one has exponentially many (in N) solutions. This corresponds to a multimodal distribution with many equally important modes. (45) is no longer valid for a full average, but only for local averages within a single mode. Numerical solutions to the TAP equations turn out to be extremely difficult in this region [17], and not all of them can be accepted, because some violate the positive definiteness of the covariance matrix \langle S_i S_j \rangle - \langle S_i \rangle \langle S_j \rangle. For a setup of the cavity approach in this complex region see chapter V of [16] and, in this volume, [31], which also discusses its application to stochastic dynamics. Finally, we want to mention the work of M. Talagrand (see e.g. [27]), who is developing a rigorous mathematical basis for the cavity method.
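As a concrete illustration of (44), here is a minimal sketch, added here and not part of the original chapter; the damping factor, system size and initialization are arbitrary choices, and the couplings are sampled as in the SK model. It solves the TAP equations by damped fixed-point iteration.

```python
import numpy as np

def tap_sk(J, theta, J0, damping=0.5, tol=1e-8, max_iter=2000):
    """Damped fixed-point iteration of the TAP equations (44).

    J     : symmetric coupling matrix with zero diagonal
    theta : external fields theta_i
    J0    : N times the variance of the couplings, as in the SK model
    """
    m = 0.01 * np.random.randn(len(theta))      # small random initial magnetizations
    for _ in range(max_iter):
        q = np.mean(m ** 2)                     # q = (1/N) sum_j <S_j>^2
        # naive mean field term plus the Onsager reaction term -J0*(1-q)*m_i
        m_new = np.tanh(theta + J @ m - J0 * (1.0 - q) * m)
        if np.max(np.abs(m_new - m)) < tol:
            return m_new
        m = (1 - damping) * m + damping * m_new
    return m

# Example: an SK instance with N = 200 spins and J0 = 0.5
N, J0 = 200, 0.5
J = np.random.randn(N, N) * np.sqrt(J0 / N)
J = np.triu(J, 1); J = J + J.T                  # symmetric, zero diagonal
m = tap_sk(J, theta=0.1 * np.random.randn(N), J0=J0)
```

For small J_0 the iteration converges quickly to the unique solution; deep in the spin glass phase it becomes unreliable, in line with the remarks above.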
7 TAP equations II: Plefka's Expansion

Plefka's expansion [23] is a method for deriving the TAP equations by a systematic perturbative computation of a function G(m) which is minimized by the vector of expectations m = \langle S \rangle. To define G(m), we go back to the minimization of the variational free energy (12), and do not restrict the distributions Q to be product distributions. We minimize F(Q) = E[Q] - S[Q] in two steps. In the first step, we perform a constrained minimization in the family of all distributions Q_m which satisfy

\langle S \rangle_Q = m ,   (46)

where m is fixed. We define the Gibbs Free Energy as the constrained minimum

G(m) = \min_Q \big\{ E[Q] - S[Q] \ \big|\ \langle S \rangle_Q = m \big\} .   (47)

In the second step, we minimize G with respect to the vector m. Since the full minimizer of F[Q] equals the true distribution P, the minimizer of G(m) coincides with the vector of true expectations \langle S_i \rangle.

Constrained optimization problems like (47) can be transformed into unconstrained ones by introducing appropriate Lagrange multipliers h_i, where we have to minimize

E[Q] - S[Q] - \sum_i h_i \big( \langle S_i \rangle_Q - m_i \big) ,   (48)

and the h_i's must be chosen such that (46) holds. (48) is again of the form of a variational free energy (12), where H[S] is replaced by H[S] - \sum_i h_i S_i. Hence, the minimizing distribution is just

Q(S) = Z^{-1}(h)\, e^{-H[S] + \sum_i h_i S_i}   (49)

with Z(h) = \sum_S e^{-H[S] + \sum_i h_i S_i}. Inserting this solution back into (47) yields

G(m, h) = \sum_i h_i m_i - \ln \sum_S e^{-H[S] + \sum_i h_i S_i} .   (50)

The condition (46) on the h_i can finally be introduced by the variation over the vector h:

G(m) = \max_h \Big\{ \sum_i h_i m_i - \ln \sum_S e^{-H[S] + \sum_i h_i S_i} \Big\} .   (51)

This follows by setting the gradient with respect to h equal to zero and checking the matrix of second derivatives. The geometric meaning of the function G(m) within Amari's Information Geometry is highlighted in the chapters [28; 1].

Why do we bother solving the more complicated two-stage optimization process, when computing G(m) is as complicated as computing the exact free energy F[P] = -\ln Z? It turns out that a useful perturbation expansion of G(m) with respect to the complicated coupling term H[S] can be developed. We replace H[S]
by \lambda H[S] in (51) and expand (setting \theta_i = 0 for simplicity)

G(m) = G_0(m) + \lambda G_1(m) + \frac{\lambda^2}{2} G_2(m) + \ldots   (52)

with G_n = \frac{\partial^n G(m)}{\partial \lambda^n}\big|_{\lambda=0}. The computation of the G_n is a bit tricky, because one also has to expand the Lagrange parameters h_i which maximize (51) in powers of \lambda. However, the first two terms are simple. To zeroth order we obtain m_i = \tanh(h_i^0) and

G_0(m) = \sum_i \Big\{ \frac{1 + m_i}{2} \ln \frac{1 + m_i}{2} + \frac{1 - m_i}{2} \ln \frac{1 - m_i}{2} \Big\} .   (53)

The calculation of the first order term is also simple, because the first derivative of G at \lambda = 0 can be written as an expectation of H[S] with respect to a factorizing distribution with mean values \langle S_i \rangle = m_i. We get

G_1(m) = - \sum_{i<j} J_{ij} m_i m_j .   (54)

A comparison of the first two terms with (12), (23) and (24) shows that we have already recovered the simple mean field approximation. One can show that the second order term in the expansion is

G_2(m) = - \frac{1}{2} \sum_{ij} J_{ij}^2 \big( 1 - m_i^2 \big)\big( 1 - m_j^2 \big) .   (55)

Minimizing (52) with respect to m for \lambda = 1 and keeping only terms up to second order yields the TAP equations (44)³. Plefka's method allows us to recover the TAP equations from a systematic expansion, which in principle allows for improvements by adding higher order terms. Corrections of this type can be found in other chapters in this book [32; 28]. Moreover, the approximate computation of G(m) can be used to get an approximation for the free energy -\ln Z = F[P] = \min_m G(m) as well. For the SK model, Plefka [23] shows that all terms beyond second order in the \lambda expansion (52) can be neglected with probability 1 (with respect to random drawings of the J_ij's) for N \to \infty, as long as we are not in the complex (spin glass) phase of the model.

³ One also has to replace J_ij² by its average.
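To make the truncation concrete, the following sketch (an illustration added here, not from the original text) evaluates the Plefka free energy G_0 + G_1 + G_2/2 at \lambda = 1 for a given magnetization vector:

```python
import numpy as np

def plefka_G(m, J, second_order=True):
    """Plefka free energy G0 + G1 + G2/2 at lambda = 1, eqs. (52)-(55).

    m : magnetizations in (-1, 1);  J : symmetric couplings, zero diagonal.
    """
    p, n = (1 + m) / 2, (1 - m) / 2
    G0 = np.sum(p * np.log(p) + n * np.log(n))   # negative entropy, eq. (53)
    G1 = -0.5 * m @ J @ m                        # eq. (54); 1/2 corrects double counting
    G = G0 + G1
    if second_order:
        v = 1 - m ** 2
        G += -0.25 * v @ (J ** 2) @ v            # (1/2) * G2, eq. (55)
    return G
```

Setting the gradient of this truncated function to zero, and replacing J_ij² by its average J_0/N as in the footnote, reproduces the TAP equations (44).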
8 TAP equations III: Beyond the SK model

The TAP approach is special among the other mean field methods in the sense that one has to make probabilistic assumptions on the couplings J_ij in (3) in order to derive the correct MF equations. This causes extra problems, because the magnitude of the Onsager correction term will depend on the distribution of the J_ij's. E.g., both the SK model and the Hopfield model [6] belong to the same class of models (3), but are defined by different probability distributions for the couplings J_ij. The weak correlations that are present between the couplings in the Hopfield model prevent us from using the same arguments that led us to (43). In fact, the derivation presented in chapter XIII of [16] leads to a different result. A similar effect can be observed in the Plefka expansion (52). If the couplings are not simple i.i.d. random variables, the expansion cannot be truncated after the second order term. An identification of the terms which survive in the limit N \to \infty is necessary [20].

Is there a general way of deriving the correct TAP equations for the different distributions of couplings? The chapters [13] and [18] present different approaches to this problem. The first one is based on identifying new auxiliary variables, and couplings between them, for which independence is still valid. This leads to TAP-like equations which are valid even for a sparse connectivity of couplings. However, the explicit knowledge of the underlying distribution of couplings is required. The second approach, motivated by earlier work of [20], develops an adaptive TAP method which does not make explicit assumptions about the distribution. It is, however, restricted to extensive connectivities.

9 Outlook

We have discussed different types of mean field methods in this chapter. Although we were able to show that in certain limits these approximations become exact, we cannot give a general answer to the question of how well they will perform on arbitrary real data problems. The situation is perhaps simpler in statistical physics, where there is often more detailed knowledge about the properties of a physical system, which helps to motivate a certain approximation scheme. Hence, a critical reader may argue that, especially in cases where MF approaches do not lead to a bound, these approximations are somewhat uncontrolled and cannot be trusted. We believe that the situation is less pessimistic. We have seen in this chapter that the MF equations often appear as low order terms in systematic perturbation expansions. Hence, a computation of higher order terms can be useful to check the accuracy of the approximation and may possibly also give error bars on the predictions. We hope that further work in this direction will provide us with approximation methods for complex probabilistic models which are both efficient and reliable.

References
[1] Amari S., Ikeda S. and Shimokawa H., this book.
[2] Barber D., this book.
[3] Bethe H.A., Proc. R. Soc. London, Ser. A, 150, 552 (1935).
[4] Frey B.J. and Koetter R., this book.
[5] Ghahramani Z. and Beal M.J., this book.
[6] Hopfield J.J., Proc. Nat. Acad. Sci. USA, 79, 2554 (1982).
[7] Humphreys K. and Titterington D.M., this book.
[8] Højen-Sørensen P.A.d.F.R., Winther O. and Hansen L.K., Ensemble Learning and Linear Response Theory for ICA, submitted to NIPS'2000 (2000).
[9] Jaakkola T., this book.
[10] Kappen H.J. and Wiegerinck W., this book.
[11] Kappen H.J. and Rodriguez F.B., Efficient Learning in Boltzmann Machines Using Linear Response Theory, Neural Computation 10, 1137 (1998).
[12] Kabashima Y. and Saad D., Belief propagation vs. TAP for decoding corrupted messages, Europhys. Lett. 44, 668 (1998).
[13] Kabashima Y. and Saad D., this book.
[14] Mezard M., The Space of Interactions in Neural Networks: Gardner's Computation with the Cavity Method, J. Phys. A (Math. Gen.) 22, 2181 (1989).
[15] Mezard M. and Parisi G., Mean Field Theory of Randomly Frustrated Systems with Finite Connectivity, Europhys. Lett. 3, 1067 (1987).
[16] Mezard M., Parisi G. and Virasoro M.A., Europhys. Lett. 1, 77 (1986), and Spin Glass Theory and Beyond, Lecture Notes in Physics 9, World Scientific (1987).
[17] Nemoto K. and Takayama H., J. Phys. C 18, L529 (1985).
[18] Opper M. and Winther O., this book.
[19] Parisi G., Statistical Field Theory, Addison Wesley, Reading, Massachusetts (1988).
[20] Parisi G. and Potters M., Mean-Field Equations for Spin Models with Orthogonal Interaction Matrices, J. Phys. A (Math. Gen.) 28, 5267 (1995).
[21] Pearl J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Francisco (1988).
[22] Pineda F.J., Resch C. and Wang I.-J., this book.
[23] Plefka T., Convergence condition of the TAP equations for the infinite-ranged Ising spin glass model, J. Phys. A 15, 1971 (1982).
[24] Saul L.K., Jaakkola T. and Jordan M.I., Mean Field Theory for Sigmoid Belief Networks, J. Artificial Intelligence Research 4, 61-76 (1996).
[25] Saad D., Kabashima Y. and Vicente R., this book.
[26] Sherrington D. and Kirkpatrick S., Phys. Rev. Lett. 35, 1792 (1975).
[27] Talagrand M., Self Averaging and the Space of Interactions in Neural Networks, Random Structures and Algorithms 14, 199 (1998), and also papers on his webpage http://www.math.ohio-state.edu/~talagran/.
[28] Tanaka T., this book.
[29] Thouless D.J., Anderson P.W. and Palmer R.G., Solution of a 'Solvable Model of a Spin Glass', Phil. Mag. 35, 593 (1977).
[30] Weiss Y., this book.
[31] Wong K.Y., Li S. and Luo P., this book.
[32] Yedidia J.S., this book.
[33] Zinn-Justin J., Quantum Field Theory and Critical Phenomena, Clarendon Press, Oxford (1989).
3 An Idiosyncratic Journey Beyond Mean Field Theory

Jonathan S. Yedidia

The connecting thread between the different methods described here is the Gibbs free energy. After introducing the inference problem we are interested in analyzing, I will define the Gibbs free energy, and describe how to derive a mean field approximation to it using a variational approach. I will then explain how one might re-derive and correct the mean field and TAP free energies using high temperature expansions with constrained one-node beliefs. I will explore the relationships between the high-temperature expansion approach, the Bethe approximation, and the belief propagation algorithm, and point out in particular the equivalence of the Bethe approximation and belief propagation. Finally, I will describe Kikuchi approximations to the Gibbs free energy and advertise new belief propagation algorithms that efficiently compute beliefs equivalent to those obtained from the Kikuchi free energy.

1 Introduction

In this chapter I will try to clarify the relationships between different ways of deriving or correcting mean field theory. The December 1999 NIPS workshop on "Advanced Mean Field Methods" succeeded nicely in bringing together physicists and computer scientists, who nowadays often work on precisely the same problems, but come to these problems with different perspectives, methods, names and notations. Some of this chapter is therefore devoted to presenting translations between the language of the physicist and the language of the computer scientist, although I am sure that my original training as a physicist will show through. I will only cover methods that I have personally used, so this chapter does not attempt to be a thorough survey of its subject. Readers interested in more background on the statistical physics of disordered systems (particularly with regard to the technique of averaging over disorder using the replica method) might also want to consult references [19], [28], and [31], while those interested in the computer science literature on graphical models might consult references [23], [11] and [7].

2 Inference

We begin by describing the problem we will focus on. In the appealing computer science jargon, this is the problem of "inference." We are given some complicated probabilistic system, which we model by a pair-wise Markov network of N nodes. We label the state of node i by x_i, and write the joint probability distribution
function as

P(x_1, x_2, \ldots, x_N) = \frac{1}{Z} \prod_{(ij)} \psi_{ij}(x_i, x_j) \prod_i \psi_i(x_i) .   (1)

Here \psi_{ij}(x_i, x_j) is the "compatibility" matrix between connected nodes i and j, \psi_i(x_i) is called the "evidence" for node i, and Z is a normalization constant called the "partition function" by physicists. The notation (ij) means that the product runs over pairs of connected nodes.

Such models have many applications, in fields as diverse as computer vision, error-correcting codes, medical diagnosis, and condensed matter physics. It may help your intuition to think of the medical diagnosis application. In such an application, the nodes could represent symptoms and diseases that a patient may have, and the links \psi_{ij}(x_i, x_j) could represent the statistical dependencies between the symptoms and diseases. Note that the links \psi_{ij}(x_i, x_j) would not normally change from one patient to the next. On the other hand, for each patient, we would obtain a different set of evidence \psi_i(x_i), which would correspond to our knowledge of the symptoms for that specific patient. We would like to use the model to infer the probability that the patient has a specific disease; that is, we want to compute a marginal probability like P_i(x_i), which is the probability that the patient has the disease denoted by node i.

I will just give a very rough idea of how such a model might be useful for other applications. In a computer vision application, we might be interested in inferring the shape of an object from the evidence provided by the pixel values of the image. In an error-correcting code, we might be interested in inferring (decoding) the most likely interpretation of a noisy message, where the Markov network itself enforces the error-correcting code. In condensed matter physics, we might want to infer (predict) the response of a magnetic system to the "evidence" of an inhomogeneous magnetic field. For the rest of the chapter, however, I will not make specific interpretations of the meanings of the nodes, and focus on the mathematics of the problem.

For some networks (small ones, or networks that have the topology of a chain or tree) we can compute any desired marginal probabilities exactly, either by explicitly summing over all possible states of the system or by using dynamic programming methods (we will return to the dynamic programming methods, which are also called "belief propagation" algorithms, later in the chapter). Otherwise, however, we must settle for approximations. If we want to make a distinction between the exact marginal probabilities and approximate ones (something physicists do not usually bother doing explicitly), then we can call the approximation of the exact marginal probability P_i(x_i) the "belief" b_i(x_i), and similarly we call the approximation of the exact two-node marginal probability P_{ij}(x_i, x_j) the belief b_{ij}(x_i, x_j).

The mathematical problem we will focus on for the rest of this chapter is as follows: given some arbitrary Markov network defined as in equation (1), compute as accurately as possible any desired beliefs.
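For small networks, the marginals in equation (1) can be computed exactly by brute-force enumeration, which is useful as a reference when testing the approximations discussed later. The following sketch is an illustration added here; the toy network and its potentials are made up for the example.

```python
import itertools
import numpy as np

def exact_marginals(n_states, psi_pair, psi_node):
    """Brute-force one-node marginals of a pairwise Markov network, eq. (1).

    psi_pair : dict {(i, j): K x K compatibility matrix psi_ij}
    psi_node : list of length-K evidence vectors psi_i
    """
    n = len(psi_node)
    marg = np.zeros((n, n_states))
    Z = 0.0
    for x in itertools.product(range(n_states), repeat=n):
        p = np.prod([psi_node[i][x[i]] for i in range(n)])
        for (i, j), psi in psi_pair.items():
            p *= psi[x[i], x[j]]
        Z += p
        for i in range(n):
            marg[i, x[i]] += p
    return marg / Z

# A toy 3-node chain with binary states
psi = np.array([[2.0, 1.0], [1.0, 2.0]])        # attractive compatibility
pairs = {(0, 1): psi, (1, 2): psi}
evidence = [np.array([1.0, 1.0]), np.array([1.0, 2.0]), np.array([1.0, 1.0])]
print(exact_marginals(2, pairs, evidence))
```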
3 Some Models from Statistical Physics

In statistical mechanics, we start with Boltzmann's law for computing joint probability functions:

P(x_1, x_2, \ldots, x_N) = \frac{1}{Z}\, e^{-E(x_1, x_2, \ldots, x_N)/T} ,   (2)

where E is the energy of the system and T is the temperature. We can re-write equation (1) in this way if we define

E(x_1, x_2, \ldots, x_N) = - \sum_{(ij)} J_{ij}(x_i, x_j) - \sum_i h_i(x_i) ,   (3)

where the "bond strength" function J_{ij}(x_i, x_j) is defined by J_{ij}(x_i, x_j) \equiv T \ln \psi_{ij}(x_i, x_j), and the "magnetic field" h_i(x_i) is defined by h_i(x_i) \equiv T \ln \psi_i(x_i).

Before turning to approximation methods, let us pause to consider some more general and some more specific models. Turning first to more specific models, we can obtain the Ising model by restricting each node i to have two states S_i = \pm 1 (for the Ising case, we follow the physics convention and label the states by S_i instead of x_i), and insisting that the compatibility matrices have the form

\psi_{ij} = \begin{pmatrix} e^{J_{ij}/T} & e^{-J_{ij}/T} \\ e^{-J_{ij}/T} & e^{J_{ij}/T} \end{pmatrix} ,

while the evidence vectors have the form \psi_i = (e^{h_i/T}, e^{-h_i/T}). In that case, we can write the energy as

E = - \sum_{(ij)} J_{ij} S_i S_j - \sum_i h_i S_i .   (4)

If we further restrict the J_{ij} to be uniform and positive, we obtain the ferromagnetic Ising model, while if we assume the J_{ij} are chosen from a random distribution, we obtain an Ising spin glass. For these models, the magnetic field h_i is usually, but not always, assumed to be uniform.

We can create more general models by introducing tensors like \psi_{ijk}(x_i, x_j, x_k) in equation (1), or equivalently tensors like J_{ijk}(x_i, x_j, x_k) in the energy. One can of course introduce tensors of even higher order. In the extreme limit, one can consider a model where E(x_1, x_2, \ldots, x_N) = J_{12 \ldots N}(x_1, x_2, \ldots, x_N). If the x_i are binary and the entries of this J tensor are chosen randomly from a Gaussian distribution, we obtain Derrida's Random Energy Model [4].

So far, we have been implicitly assuming that the nodes in the Markov network live on a fixed lattice and that each node can be in a discrete state x_i. In fact, there is nothing to stop us from taking the x_i to be continuous variables, or we can generalize to vectors \vec{r}_i, where \vec{r}_i can be interpreted as the position of the ith particle in the system. Looking at it this way, we see that equation (3) can be interpreted as an energy function for particles interacting by arbitrary two-body
forces in arbitrary one-body potentials.

4 The Gibbs Free Energy

Statistical physicists often use the following algorithm when they consider some new model of a physical system:

1. Write down the energy function.
2. Construct an approximate Gibbs free energy.
3. Solve the stationary conditions of the approximate Gibbs free energy.
4. Write paper.

To use this algorithm successfully, one needs to understand what a Gibbs free energy is, and how one might successfully approximate it. We will explore this subject from numerous points of view.

The exact Gibbs free energy G_exact can be thought of as a mathematical construction designed so that when you minimize it, you will recover Boltzmann's law. G_exact is a function of the full joint probability function P(x_1, x_2, \ldots, x_N) and is defined by

G_{exact} = U - TS ,   (5)

where U is the average (or "internal") energy:

U = \sum_{x_1, x_2, \ldots, x_N} P(x_1, x_2, \ldots, x_N)\, E(x_1, x_2, \ldots, x_N) ,   (6)

and S is the entropy:

S = - \sum_{x_1, x_2, \ldots, x_N} P(x_1, x_2, \ldots, x_N) \ln P(x_1, x_2, \ldots, x_N) .   (7)

If we minimize G_exact with respect to P(x_1, x_2, \ldots, x_N) (one needs to remember to add a Lagrange multiplier to enforce the constraint \sum_{x_1, x_2, \ldots, x_N} P(x_1, x_2, \ldots, x_N) = 1), we do indeed recover Boltzmann's Law (equation (2)), as desired. If we substitute P = \exp(-E/T)/Z into G_exact, we find that at equilibrium (that is, when the joint probability distribution has its correct value), the Gibbs free energy is equal to the Helmholtz free energy, defined by F \equiv -T \ln Z.

One can understand things this way: the Helmholtz free energy is just a number equal to U - TS at equilibrium, but the Gibbs free energy is a function that gives the value of U - TS when some constraints are applied. In the case of G_exact, we constrain the whole joint probability function P(x_1, x_2, \ldots, x_N). In other cases that we will look at shortly, we will just constrain some of the marginal probabilities. In general, there can be more than one "Gibbs free energy"; which one you are talking about depends on which additional constraints you want to apply. When we minimize a Gibbs free energy with respect to those probabilities that were constrained, we will obtain self-consistent equations that must be obeyed in equilibrium.
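The claim that minimizing G_exact recovers Boltzmann's law can be checked directly on a tiny system. The sketch below is an illustration added here, with an arbitrary random energy function; it verifies that G_exact equals the Helmholtz free energy at the Boltzmann distribution and increases for perturbed distributions.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1.5
E = rng.normal(size=8)                        # energies of 8 joint states

def gibbs_free_energy(P):
    """G_exact = U - T*S, equations (5)-(7)."""
    return P @ E + T * np.sum(P * np.log(P))

boltzmann = np.exp(-E / T); boltzmann /= boltzmann.sum()
G_eq = gibbs_free_energy(boltzmann)
helmholtz = -T * np.log(np.sum(np.exp(-E / T)))
print(G_eq, helmholtz)                        # equal at equilibrium

for _ in range(1000):
    P = np.abs(boltzmann + 0.05 * rng.standard_normal(8)); P /= P.sum()
    assert gibbs_free_energy(P) >= G_eq - 1e-12
```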
The advantage of working with a Gibbs free energy instead of Boltzmann's Law directly is that it is much easier to come up with ideas for approximations. There are in fact many different approximations that one could make to a Gibbs free energy, and much of the rest of this chapter is devoted to surveying them.

5 Mean Field Theory: The Variational Approach

One very popular way to construct an approximate Gibbs free energy involves a variational argument. The derivation given here will be from a physicist's perspective; for an introduction to variational methods from a different point of view, see [12].

Assume that we have some system which can be in, say, K different states. The probability of each state is some number P_\alpha, where \sum_{\alpha=1}^K P_\alpha = 1. Let there be some quantity X_\alpha (like the energy) which depends on which state the system is in, and introduce the notation for the mean value

\langle X \rangle \equiv \sum_{\alpha=1}^K P_\alpha X_\alpha .   (8)

Then, by the convexity of the exponential function, we can prove that

\langle e^X \rangle \geq e^{\langle X \rangle} .   (9)

Now consider the partition function

Z = \sum_\alpha \exp(-E_\alpha / T) .   (10)

Let us introduce some arbitrary "trial" energy function E_\alpha^0. We can manipulate Z into the form

Z = \frac{\sum_\alpha \exp\big( -(E_\alpha - E_\alpha^0)/T \big) \exp(-E_\alpha^0/T)}{\sum_\alpha \exp(-E_\alpha^0/T)} \sum_\alpha \exp(-E_\alpha^0/T)   (11)

or

Z = \big\langle e^{-(E - E^0)/T} \big\rangle_0 \sum_\alpha \exp(-E_\alpha^0/T) ,   (12)

where the notation \langle X \rangle_0 means the average of X_\alpha using a trial probability distribution

P_\alpha^0 = \frac{\exp(-E_\alpha^0/T)}{\sum_\alpha \exp(-E_\alpha^0/T)} .   (13)

We can now use the inequality (9) to assert that

Z \geq e^{-\langle (E - E^0)/T \rangle_0} \sum_\alpha \exp(-E_\alpha^0/T)   (14)

for any function E_\alpha^0. In terms of the Helmholtz free energy F \equiv -T \ln Z, we can equivalently assert that

F \leq \langle E - E^0 \rangle_0 - T \ln \sum_\alpha \exp(-E_\alpha^0/T) ,   (15)
where we define the quantity on the right-hand side of the inequality as the variational mean field free energy F_var corresponding to the trial probability function P_\alpha^0. A little more manipulation gives us

F_{var} = \langle E \rangle_0 - T S_0 \geq F ,   (16)

where S_0 is the trial entropy defined by S_0 = -\sum_\alpha P_\alpha^0 \ln P_\alpha^0. This inequality gives us a useful variational argument: we will look for the trial probability function P_\alpha^0 which gives us the lowest variational free energy.

To be able to use the variational principle in practice, we must restrict ourselves to a class of probabilities for which we can actually analytically compute F_var. The quality of the variational approximation will depend on how well the trial probability function can represent the true one. For continuous x_i or \vec{r}_i, one can use Gaussians as very good, yet tractable variational functions [28; 2; 3]. Richard Feynman was one of the first physicists to use this kind of variational argument (with Gaussian trial probability functions) in his treatment of the polaron problem [5]. The variational probability functions that are tractable for discrete x_i are not nearly as good.

When people talk about "mean field theory," they are usually referring to using a trial probability function of the factorized form

P^0(x_1, x_2, \ldots, x_N) = \prod_i b_i(x_i)   (17)

and computing F_var for some energy function of a form like equation (3). The "mean field" Gibbs free energy that results is

G_{MF} = - \sum_{(ij)} \sum_{x_i, x_j} J_{ij}(x_i, x_j)\, b_i(x_i)\, b_j(x_j) - \sum_i \sum_{x_i} h_i(x_i)\, b_i(x_i) + T \sum_i \sum_{x_i} b_i(x_i) \ln b_i(x_i) .   (18)

To obtain the beliefs in equilibrium according to this approximation, one minimizes G_MF with respect to the beliefs b_i(x_i). Let us see how this works for the Ising model with no external field. In that case, it makes sense to define the local magnetization

m_i \equiv \sum_{S_i} S_i\, b_i(S_i) = b_i(+1) - b_i(-1) ,   (19)

which is a scalar that can take on values from -1 to 1. In terms of the magnetization, we have

G_{MF} = - \sum_{(ij)} J_{ij} m_i m_j + T \sum_i \Big[ \frac{1 + m_i}{2} \ln \Big( \frac{1 + m_i}{2} \Big) + \frac{1 - m_i}{2} \ln \Big( \frac{1 - m_i}{2} \Big) \Big] ,   (20)

and the mean field stationary conditions are

m_i = \tanh\Big( \sum_j J_{ij} m_j / T \Big) .   (21)
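The stationary conditions (21) can be solved by damped fixed-point iteration. The following minimal sketch is added here as an illustration; the damping factor, iteration count and initialization are arbitrary choices.

```python
import numpy as np

def naive_mean_field(J, T, n_iter=500, damping=0.5):
    """Damped iteration of the stationary conditions (21):
    m_i = tanh(sum_j J_ij m_j / T), for a symmetric coupling matrix J."""
    m = 0.01 * np.random.randn(len(J))   # small random start to allow symmetry breaking
    for _ in range(n_iter):
        m = (1 - damping) * m + damping * np.tanh(J @ m / T)
    return m
```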
If we further specialize to the case of a ferromagnet on a d-dimensional hypercubic lattice, set all the J_{ij} = 1/(2d), and assume that all m_i are equal to the same magnetization m, we can analytically analyze the solutions of this equation. We find that above T_c = 1, the only solution is m = 0, while below T_c we have two other solutions with positive or negative magnetization. This is a classic example of a phase transition that breaks the underlying symmetry in a model. The mean field prediction of a phase transition is qualitatively correct for dimension d \geq 2. Other bulk thermodynamic quantities, like the susceptibility \chi \equiv \partial m / \partial h and the specific heat C \equiv \partial U / \partial T, are also easy to compute once we have the stationary conditions.

How good an approximation does mean field theory give? It depends a lot on the model. For the Ising ferromagnet, mean field theory becomes exact for a hypercubic lattice in the limit of infinite dimensions, or for an "infinite-ranged" lattice where every node is connected to every other node. On the other hand, for lower dimensional ferromagnets, or spin glasses in any dimension, mean field theory can give quite poor results. In general, mean field theory does badly when the nodes in a network fluctuate a lot around their mean values, because it incorrectly insists that all two-node beliefs b_{ij}(x_i, x_j) are simply given by b_{ij}(x_i, x_j) = b_i(x_i) b_j(x_j). In practice, one sees many papers where questionable mean field approximations are used when it would not have been too difficult to obtain better results using one of the techniques that I describe in the rest of the chapter.

6 Correcting Mean Field Theory

Mean field theory is exact for the infinite-ranged ferromagnet, so when physicists started contemplating spin glasses in the 1970's, they quickly turned to the simplest corresponding model: the infinite-ranged Sherrington-Kirkpatrick (SK) Ising spin glass model with zero field and J_{ij}'s chosen from a zero-mean Gaussian distribution [25]. Thouless, Anderson and Palmer (TAP) presented "as a fait accompli" [26] a Gibbs free energy that they claimed should be exact for this model:

-\beta G_{TAP} = - \sum_i \Big[ \frac{1 + m_i}{2} \ln \Big( \frac{1 + m_i}{2} \Big) + \frac{1 - m_i}{2} \ln \Big( \frac{1 - m_i}{2} \Big) \Big] + \beta \sum_{(ij)} J_{ij} m_i m_j + \frac{\beta^2}{2} \sum_{(ij)} J_{ij}^2 (1 - m_i^2)(1 - m_j^2) ,   (22)

where \beta \equiv 1/T is the inverse temperature. The only difference between the TAP and ordinary mean field free energy is the last term, which is sometimes called the "Onsager reaction" term. I have written the TAP free energy in a suggestive form: it appears to be a Taylor expansion in powers of \beta. Plefka showed that one could in fact derive G_TAP from such a Taylor expansion [24]. Antoine Georges and I later [10] showed how to continue the Taylor expansion to terms beyond O(\beta^2), and exploited this kind of expansion for a variety of statistical mechanical [8; 30] and quantum mechanical [9] models. Of course, the higher-order terms are important for any model that is not infinite-ranged. Because this technique is little-known, but quite generally
applicable, I will review it here using the Ising spin glass energy function.

The variational approximation gives a rigorous upper bound on the Helmholtz free energy, but there is no reason to believe that it is the best approximation one can make for the magnetization-dependent Gibbs free energy. We can construct such a Gibbs free energy by adding a set of external auxiliary fields (Lagrange multipliers) that are used to insure that all the magnetizations are constrained to their desired values. Note that the auxiliary fields are temperature-dependent. Of course, when the magnetizations are at their equilibrium values, no auxiliary fields will be necessary. We write

-\beta G(\beta, m_i) = \ln \Big[ \sum_{\{S_i\}} \exp\Big( -\beta E + \sum_i \lambda_i(\beta)\,(S_i - m_i) \Big) \Big] ,   (23)

where the \lambda_i(\beta) are our auxiliary fields. We can use this exact formula to expand -\beta G(\beta, m_i) around \beta = 0:

-\beta G(\beta, m_i) = \big( -\beta G \big)_{\beta=0} + \beta\, \frac{\partial(-\beta G)}{\partial \beta}\Big|_{\beta=0} + \frac{\beta^2}{2}\, \frac{\partial^2(-\beta G)}{\partial \beta^2}\Big|_{\beta=0} + \cdots .   (24)

At \beta = 0, the spins are entirely controlled by their auxiliary fields, and so we have reduced our problem to one of independent spins. Since m_i is fixed equal to \langle S_i \rangle for any inverse temperature \beta, it is in particular equal to \langle S_i \rangle when \beta = 0, which gives us the relation

m_i = \tanh \lambda_i(0) .   (25)

From the definition of -\beta G(\beta, m_i) given in equation (23), we find that

\big( -\beta G \big)_{\beta=0} = \sum_i \big[ \ln\big( 2 \cosh \lambda_i(0) \big) - \lambda_i(0)\, m_i \big] .   (26)

Eliminating the \lambda_i(0), we obtain

\big( -\beta G \big)_{\beta=0} = - \sum_i \Big[ \frac{1 + m_i}{2} \ln \Big( \frac{1 + m_i}{2} \Big) + \frac{1 - m_i}{2} \ln \Big( \frac{1 - m_i}{2} \Big) \Big] ,   (27)

which is just the mean field entropy. Considering next the first derivative, we find that

\frac{\partial(-\beta G)}{\partial \beta}\Big|_{\beta=0} = \Big\langle \sum_{(ij)} J_{ij} S_i S_j \Big\rangle_{\beta=0} + \sum_i \frac{\partial \lambda_i}{\partial \beta} \big\langle S_i - m_i \big\rangle_{\beta=0} .   (28)

The second term vanishes because \langle S_i \rangle = m_i, and at \beta = 0 the two-node correlation functions factorize, so we find that

\frac{\partial(-\beta G)}{\partial \beta}\Big|_{\beta=0} = \sum_{(ij)} J_{ij} m_i m_j ,   (29)

which is, of course, the same as the variational internal energy term.

Naturally, we can continue this expansion to arbitrarily high order if we work hard enough. Unfortunately, neither Georges and I, nor Parisi and Potters who
later examined this expansion [22], were able to derive the Feynman rules for a fully diagrammatic expansion, but there are some tricks that make the computation easier [10]. To order \beta^4, we find that

-\beta G = - \sum_i \Big[ \frac{1 + m_i}{2} \ln \Big( \frac{1 + m_i}{2} \Big) + \frac{1 - m_i}{2} \ln \Big( \frac{1 - m_i}{2} \Big) \Big]
  + \beta \sum_{(ij)} J_{ij} m_i m_j
  + \frac{\beta^2}{2} \sum_{(ij)} J_{ij}^2 (1 - m_i^2)(1 - m_j^2)
  + \frac{2\beta^3}{3} \sum_{(ij)} J_{ij}^3\, m_i (1 - m_i^2)\, m_j (1 - m_j^2)
  + \beta^3 \sum_{(ijk)} J_{ij} J_{jk} J_{ki} (1 - m_i^2)(1 - m_j^2)(1 - m_k^2)
  - \frac{\beta^4}{3} \sum_{(ij)} J_{ij}^4 (1 - m_i^2)(1 - m_j^2)\big( 1 + 3 m_i^2 + 3 m_j^2 - 15 m_i^2 m_j^2 \big)
  + 2 \beta^4 \sum_{(ijk)} J_{ij}^2 J_{jk} J_{ki}\, m_i (1 - m_i^2)\, m_j (1 - m_j^2)\, (1 - m_k^2)
  + \beta^4 \sum_{(ijkl)} J_{ij} J_{jk} J_{kl} J_{li} (1 - m_i^2)(1 - m_j^2)(1 - m_k^2)(1 - m_l^2)
  + \cdots   (30)

where the notation (ij), (ijk), or (ijkl) means that one should sum over all distinct pairs, triplets, or quadruplets of spins. For the ferromagnet on a d-dimensional hypercubic lattice, all these terms can be reorganized according to their contribution in powers of 1/d. It is easy to show that only the mean field terms contribute in the limit d \to \infty, and to generate 1/d expansions for all the bulk thermodynamic quantities, including the magnetization [10].

A few points should be made about the Taylor expansion of equation (30). First, as with any Taylor expansion, there is a danger that the radius of convergence of the expansion will be too small to obtain results for the value of \beta you are interested in. It is hard to say anything about this issue in general. For ferromagnets, there does not seem to be any problem at low or high temperatures, but for the SK model the issue is non-trivial, and was analyzed by Plefka [24]. Secondly, since the expansion was presented as one that starts at \beta = 0, it is initially surprising that it can work at low temperatures. The explanation, at least for the ferromagnetic case, is that the higher-order terms become exponentially small in the limit T \to 0. Thus, the expansion works very well for T \to 0 or T \to \infty and is worst near T_c. Finally, the TAP free energy is sometimes justified as a "Bethe approximation," that is, as an approximation that would become exact on a tree-like lattice [1]. In fact, the general convention in the statistical physics community is to refer to
the technique of using a Bethe approximation on an inhomogeneous model as the "TAP approach." In general, to obtain the proper Bethe approximation from the expansion (30) for models on a tree-like lattice, we need to sum over all the higher-order terms that do not include loops of nodes. The TAP free energy for the SK model only simplifies because, for that model, all terms of order \beta^3 or higher are believed to vanish anyway in the limit N \to \infty (which is the "thermodynamic limit" physicists are interested in). In the next section, we will describe a much simpler way to arrive at the important Bethe approximation.

7 The Bethe Approximation

The remaining sections of this chapter will discuss the Bethe and Kikuchi approximations and belief propagation algorithms. My understanding of these subjects was formed by a collaboration with Bill Freeman at MERL and Yair Weiss at Berkeley. These sections can be considered an introduction to the work that we did together [29].

So far we have discussed Gibbs free energies with just one-node beliefs b_i(x_i) constrained. The next obvious step to take is to constrain the two-node beliefs b_{ij}(x_i, x_j) as well. For Markov networks that have a tree-like topology, taking this step is sufficient to obtain the exact Gibbs free energy. The reason is that for these models, the exact joint probability distribution itself can be factorized into a form that only depends on one-node and two-node marginal probabilities:

P(x_1, x_2, \ldots, x_N) = \prod_{(ij)} P_{ij}(x_i, x_j) \prod_i \big[ P_i(x_i) \big]^{1 - q_i} ,   (31)

where q_i is the number of nodes that are connected to node i.

Recall that the exact Gibbs free energy is G = U - TS, where the internal energy is U = \sum_\alpha P_\alpha E_\alpha, the entropy is S = -\sum_\alpha P_\alpha \ln P_\alpha, and \alpha is an index over every possible state. Using equation (31), we find that the exact entropy for models with tree-like topology is

S = - \sum_{(ij)} \sum_{x_i, x_j} P_{ij}(x_i, x_j) \ln P_{ij}(x_i, x_j) + \sum_i (q_i - 1) \sum_{x_i} P_i(x_i) \ln P_i(x_i) .   (32)

The average energy can be expressed exactly in terms of one-node and two-node marginal probabilities for pair-wise Markov networks of any topology:

U = - \sum_{(ij)} \sum_{x_i, x_j} P_{ij}(x_i, x_j) \big( J_{ij}(x_i, x_j) + h_i(x_i) + h_j(x_j) \big) + \sum_i (q_i - 1) \sum_{x_i} P_i(x_i)\, h_i(x_i) .   (33)

The first term is just the average energy of each link, and the second term is a correction for the fact that the evidence at each node is counted q_i - 1 times too many. The Bethe approximation to the Gibbs free energy amounts to using these expressions (with beliefs substituting for exact marginal probabilities) for any pair-
wise Markov network:

G_{Bethe} = \sum_{(ij)} \sum_{x_i, x_j} b_{ij}(x_i, x_j) \big( T \ln b_{ij}(x_i, x_j) + E_{ij}(x_i, x_j) \big) - \sum_i (q_i - 1) \sum_{x_i} b_i(x_i) \big( T \ln b_i(x_i) + E_i(x_i) \big) ,   (34)

where we have introduced the local energies E_i(x_i) \equiv -h_i(x_i) and E_{ij}(x_i, x_j) \equiv -J_{ij}(x_i, x_j) - h_i(x_i) - h_j(x_j). Of course, the beliefs b_{ij}(x_i, x_j) and b_i(x_i) must obey the standard normalization conditions \sum_{x_i} b_i(x_i) = 1 and \sum_{x_i, x_j} b_{ij}(x_i, x_j) = 1, and the marginalization conditions b_i(x_i) = \sum_{x_j} b_{ij}(x_i, x_j).

There is more than one way to obtain the stationarity conditions for the Bethe free energy. For inhomogeneous models, the most straightforward approach is to form a Lagrangian L by adding Lagrange multipliers which enforce the normalization and marginalization conditions, and to differentiate the Lagrangian with respect to the beliefs and those Lagrange multipliers. We have

L = G_{Bethe} + \sum_{(ij)} \sum_{x_j} \lambda_{ij}(x_j) \Big( b_j(x_j) - \sum_{x_i} b_{ij}(x_i, x_j) \Big) + \sum_{(ij)} \sum_{x_i} \lambda_{ji}(x_i) \Big( b_i(x_i) - \sum_{x_j} b_{ij}(x_i, x_j) \Big) + \sum_i \gamma_i \Big( 1 - \sum_{x_i} b_i(x_i) \Big) + \sum_{(ij)} \gamma_{ij} \Big( 1 - \sum_{x_i, x_j} b_{ij}(x_i, x_j) \Big) .   (35)

Of course, the derivatives with respect to the Lagrange multipliers give back the desired constraints, while the derivatives with respect to the beliefs give back equations for beliefs in terms of Lagrange multipliers:

b_i(x_i) = \frac{1}{Z_i} \exp\Big( -\frac{1}{T} \Big[ E_i(x_i) - \frac{1}{q_i - 1} \sum_{j \in N(i)} \lambda_{ji}(x_i) \Big] \Big)   (36)

and

b_{ij}(x_i, x_j) = \frac{1}{Z_{ij}} \exp\Big( -\frac{1}{T} \big[ E_{ij}(x_i, x_j) - \lambda_{ij}(x_j) - \lambda_{ji}(x_i) \big] \Big) ,   (37)

where Z_i and Z_{ij} are constants which enforce the normalization conditions. Finally, one can use the marginalization conditions to obtain self-consistent equations for the Lagrange multipliers.

The Bethe approximation is a significantly better approximation to the Gibbs free energy than the mean field approximation. The only real difficulty is a practical one: how do we minimize the Bethe free energy efficiently? As we shall see, it turns out that the belief propagation algorithm, which was developed by Pearl following an entirely different path, provides a possible answer.
8 Belief Propagation

Belief propagation algorithms can probably best be understood by imagining that each node in a Markov network represents a person, who communicates by "messages" with those people on connected nodes about what their beliefs should be. Let us see what the properties of these messages should be if we want to get reasonable equations for the beliefs b_i(x_i). We will denote the message from node j to node i by M_{ji}(x_i). Note that the message has the same dimensionality as node i: the person at j is telling the one at i something like "you should believe in your state 1 twice as strongly as your state 2, and your state number 3 should be impossible." That message would be the vector (2, 1, 0).

Now imagine that the person at node i is looking at all the messages that he is getting, plus the independent evidence that he alone is receiving, denoted by \psi_i(x_i). Assume that each message is arriving independently and is reliably informing the person at node i about something he has no other way of finding out. Given equally reliable messages and evidence, what should his beliefs be? A reasonable guess would be

b_i(x_i) = \alpha\, \psi_i(x_i) \prod_{j \in N(i)} M_{ji}(x_i) ,   (38)

where \alpha is a normalization constant, and N(i) denotes all the nodes neighboring i. Thus a person following this rule who got messages (2, 1, 0) and (1, 1, 1) and had personal evidence (1, 2, 1) would have a belief (.5, .5, 0). His thought process would work like this: "The first message is telling me that state 3 is impossible, the second message can be ignored because it is telling me it does not care, while my personal evidence is telling me to believe in state 2 twice as strongly as state 1, which is the opposite of what the first message tells me, so I will just believe in state 1 and state 2 equally strongly."

Now consider the joint beliefs of a pair of neighboring nodes i and j. Clearly they must depend on the compatibility matrix \psi_{ij}(x_i, x_j), the evidence at each node \psi_i(x_i) and \psi_j(x_j), and all the messages coming into nodes i and j. The obvious guess would be the rule

b_{ij}(x_i, x_j) = \alpha\, \psi_{ij}(x_i, x_j)\, \psi_i(x_i)\, \psi_j(x_j) \prod_{k \in N(i) \backslash j} M_{ki}(x_i) \prod_{l \in N(j) \backslash i} M_{lj}(x_j) .   (39)

If we combine these rules for the one-node and two-node beliefs with the marginalization condition

b_i(x_i) = \sum_{x_j} b_{ij}(x_i, x_j) ,   (40)

we obtain the self-consistent equations for the messages

M_{ij}(x_j) = \alpha \sum_{x_i} \psi_{ij}(x_i, x_j)\, \psi_i(x_i) \prod_{k \in N(i) \backslash j} M_{ki}(x_i) ,   (41)

where N(i) \backslash j means all nodes neighboring i except for j. The belief propagation algorithm amounts to solving these message equations iteratively, and using the solution for the messages in the belief equations.
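A minimal implementation of equations (38)-(41) might look as follows. This sketch is added here as an illustration; it assumes unit initialization of the messages and a fixed number of parallel updates, both arbitrary choices.

```python
import numpy as np

def belief_propagation(n_states, psi_pair, psi_node, n_iter=50):
    """Parallel belief propagation, eqs. (38)-(41), on a pairwise Markov network.

    psi_pair : dict {(i, j): K x K matrix psi_ij}, each edge stored once
    psi_node : list of length-K evidence vectors psi_i
    """
    n = len(psi_node)
    nbrs = {i: [] for i in range(n)}
    for (i, j) in psi_pair:
        nbrs[i].append(j); nbrs[j].append(i)

    def psi(i, j):                       # psi_ij as a matrix indexed [x_i, x_j]
        return psi_pair[(i, j)] if (i, j) in psi_pair else psi_pair[(j, i)].T

    # M[(i, j)] is the message from node i to node j
    M = {(i, j): np.ones(n_states) for i in nbrs for j in nbrs[i]}
    for _ in range(n_iter):
        M_new = {}
        for (i, j) in M:
            prod = psi_node[i].copy()
            for k in nbrs[i]:
                if k != j:
                    prod *= M[(k, i)]    # product over N(i) \ j, eq. (41)
            msg = psi(i, j).T @ prod     # sum over x_i
            M_new[(i, j)] = msg / msg.sum()
        M = M_new

    beliefs = []
    for i in range(n):
        b = psi_node[i].copy()
        for k in nbrs[i]:
            b *= M[(k, i)]               # eq. (38)
        beliefs.append(b / b.sum())
    return beliefs
```

On a tree, such as the three-node chain used in the earlier enumeration sketch, the returned beliefs coincide with the exact marginals; on graphs with loops they are the usual loopy-BP approximations discussed below.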
So far I have probably just convinced you that the belief propagation algorithm is vaguely plausible. Pearl did more than that, of course: he showed directly that all the belief propagation equations written above are exact for Markov networks that have a tree-like topology [23]. One might note that this fact was already partially known in the physics literature: as long ago as 1979, T. Morita wrote down the correct belief propagation equations for the case of an Ising spin glass in a random field [20]. Of course, the suitability of these equations as an algorithm was not appreciated. Recently, Y. Kabashima and D. Saad [13; 14] have shown that for a number of other specific disordered models, the TAP approach and belief propagation give rise to identical equations, and speculated that this might be true in general.

Freeman, Weiss and I have shown that this identity does in fact hold in general [29]. To prove it for general Markov networks, you simply need to identify the following relationship between the Lagrange multipliers \lambda_{ij}(x_j) that we introduced in the last section and the messages M_{ij}(x_j):

\lambda_{ij}(x_j) = T \ln \prod_{k \in N(j) \backslash i} M_{kj}(x_j) .   (42)

Using this relation, one can easily show that equations (36) and (37) derived for the Bethe approximation in the last section are equivalent to the belief propagation equations (38) and (39).

9 Kikuchi Approximations and Generalized Belief Propagation

Pearl pointed out that belief propagation was not exact for networks with loops, but that has not stopped a number of researchers from using it on such networks, often very successfully. One particularly dramatic case is the near Shannon-limit performance of "Turbo codes" and low density parity check codes, whose decoding algorithm is equivalent to belief propagation on a network with loops [18; 17]. For some problems in computer vision involving networks with loops, belief propagation has worked well and converged very quickly [7; 6; 21]. On the other hand, for other networks with loops, belief propagation gives poor results or fails to converge [21; 29].

What has been generally missing has been an idea for how one might systematically correct belief propagation in a way that preserves its main advantage: the rapidity with which it normally converges [27]. The idea which turned out to be successful was to work out approximations to the Gibbs free energy that are even more accurate than the Bethe approximation, and find corresponding "generalized" belief propagation algorithms.

Once one has the idea of improving the approximation for the Gibbs free energy by constraining two-node beliefs like b_{ij}(x_i, x_j), it is natural to go further and constrain higher-order beliefs as well. The "cluster variation method," which was invented by Kikuchi [15; 16], is a way of obtaining increasingly accurate approximations in precisely this way. The idea is to group the nodes of the
Markov network into basic (possibly overlapping) clusters, and then to compute an approximation to the Gibbs free energy by summing the free energies of the basic clusters, minus the free energy of over-counted intersections of clusters, minus the free energy of over-counted intersections of intersections, and so on. The Bethe approximation is the simplest example of one of these more complicated Kikuchi free energies: for that case, the basic clusters are all the connected pairs of nodes. Every Kikuchi free energy will handle the average energy exactly, and the entropy will become increasingly accurate as the size of the basic clusters increases.

Rather than repeat analysis that you can find elsewhere, I will just advertise the results of our work [29]. One can indeed derive new belief propagation algorithms based on Kikuchi free energies. They converge to beliefs that are provably equivalent to the beliefs that are obtained from the Kikuchi stationary conditions. The new messages that need to be introduced involve groups of nodes telling other groups of nodes what their joint beliefs should be. These new belief propagation algorithms have the attractive feature of being user-adjustable: by paying some additional computational cost, you can buy additional accuracy. In practice, the additional cost is not great: we found that we were able to obtain dramatic improvements in accuracy at negligible cost for some models where ordinary belief propagation performs poorly.

Acknowledgments

It is a pleasure to thank my collaborators Jean-Philippe Bouchaud, Bill Freeman, Antoine Georges, Marc Mezard, and Yair Weiss, with whom I have enjoyed exploring the issues described in this chapter.

References
[1] Bethe H.A., Proc. Royal Soc. of London A, 150, 552, 1935.
[2] Bouchaud J.P., Mezard M., Parisi G. and Yedidia J.S., J. Phys. A, 24, L1025, 1991.
[3] Bouchaud J.P., Mezard M. and Yedidia J.S., Phys. Rev. B 46, 14686, 1992.
[4] Derrida B., Phys. Rev. B 24, 2613, 1981.
[5] Feynman R.P., Phys. Rev. 97, 660, 1955.
[6] Freeman W.T. and Pasztor E., 7th International Conference on Computer Vision, 1182, 1999.
[7] Frey B.J., Graphical Models for Machine Learning and Digital Communication, Cambridge: MIT Press, 1998.
[8] Georges A., Mezard M. and Yedidia J.S., Phys. Rev. Lett. 64, 2937, 1990.
[9] Georges A. and Yedidia J.S., Phys. Rev. B 43, 3475, 1991.
[10] Georges A. and Yedidia J.S., J. Phys. A 24, 2173, 1991.
[11] Jordan M.I., ed., Learning in Graphical Models, Cambridge: MIT Press, 1998.
[12] Jordan M.I., Ghahramani Z., Jaakkola T. and Saul L.K., in Learning in Graphical Models, M.I. Jordan ed., Cambridge: MIT Press, 1998.
[13] Kabashima Y. and Saad D., Europhys. Lett. 44, 668, 1998.
[14] Kabashima Y. and Saad D., contribution to this volume, 2000.
[15] Kikuchi R., Phys. Rev. 81, 988, 1951.
[16] Kikuchi R., Special issue in honor of R. Kikuchi, Prog. Theor. Phys. Suppl., 115, 1994.
[17] MacKay D.J.C., IEEE Trans. on Inf. Theory, 1999.
[18] McEliece R., MacKay D.J.C. and Cheng J., IEEE J. on Sel. Areas in Comm. 16 (2), 140, 1998.
[19] Mezard M., Parisi G. and Virasoro M.A., Spin Glass Theory and Beyond, Singapore: World Scientific, 1987.
[20] Morita T., Physica 98A, 566, 1979.
[21] Murphy K., Weiss Y. and Jordan M., in Proc. Uncertainty in AI, 1999.
[22] Parisi G. and Potters M., J. Phys. A 28, 5267, 1995.
[23] Pearl J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, San Francisco: Morgan Kaufmann, 1988.
[24] Plefka T., J. Phys. A 15, 1971, 1982.
[25] Sherrington D. and Kirkpatrick S., Phys. Rev. Lett. 35, 1792, 1975.
[26] Thouless D.J., Anderson P.W. and Palmer R.G., Phil. Mag. 35, 593, 1977.
[27] Weiss Y., Bayesian Belief Propagation for Image Understanding, available at Yair Weiss's homepage, 1999.
[28] Yedidia J.S., 1992 Lectures in Complex Systems, L. Nadel and D. Stein, eds., Addison-Wesley, 299, 1993.
[29] Yedidia J.S., Freeman W.T. and Weiss Y., MERL TR2000-26, available at http://www.merl.com/reports/TR2000-26/, 2000.
[30] Yedidia J.S. and Georges A., J. Phys. A 23, 2165, 1990.
[31] Young A.P., ed., Spin Glasses and Random Fields, World Scientific, 1998.
4 Mean Field Theory for Graphical Models

Hilbert J. Kappen and Wim J. Wiegerinck

In this chapter, mean field theory is introduced from an information theoretic viewpoint. The mean field approximation is defined as the factorized distribution that is closest to the target distribution. When using the KL divergence to define closeness, this factorized distribution must have marginals equal to those of the target distribution. Such marginals can be approximately computed by using a Taylor series expansion in the couplings around the factorized distribution. To lowest order in the couplings, the usual naive mean field equations are obtained, and to second order one obtains the TAP equations. An important advantage of this procedure is that it does not require the concept of a free energy. Therefore, it can be applied to arbitrary probability distributions, such as those arising in asymmetric stochastic neural networks and graphical models.

1 Introduction

During the last few years, the use of probabilistic methods in artificial intelligence and machine learning has gained enormous popularity. In particular, probabilistic graphical models have become the preferred method for knowledge representation and reasoning [4]. The advantage of the probabilistic approach is that all assumptions are made explicit in the modeling process and that consequences, such as predictions on novel data, are assumption free and follow from a mechanistic computation. The drawback of the probabilistic approach is that the method is intractable. This means that the typical computation scales exponentially with the problem size.

Recently, a number of authors have proposed methods for approximate inference in large graphical models. The simplest approach gives a lower bound on the probability of a subset of variables using Jensen's inequality [14]. The method involves the minimization of the KL divergence between the target probability distribution p and some 'simple' variational distribution q. The method can be applied to any probability model, whether directed or undirected.

The Boltzmann-Gibbs distribution is widely used in physics, and mean field theory has been known for these distributions for a long time. For instance, for the Ising model on a square lattice, it is known as the Bragg-Williams approximation [3], and it is generalized to other models in the Landau theory [10]. One can show that the above lower bound corresponds to the first term in a Taylor series expansion of the free energy around a factorized model. This Taylor series can be continued, and the second order term is known as the Thouless Anderson Palmer (TAP) correction [16; 13; 6; 7]. The second order term significantly improves the quality
of the approximation, depending on the amount of frustration in the system, but is no longer a bound.

For probability distributions that are not Boltzmann-Gibbs distributions, it is not obvious how to obtain the second order approximation. However, there is an alternative way to compute the higher order corrections, based on an information theoretic argument. The general approach to this mean field approximation is introduced in section 2. Before we work out the mean field approximations for the general case, we first illustrate the idea for Boltzmann distributions in section 3. Subsequently, in section 4, we consider the general case. Finally, in section 5, we illustrate the approach for sigmoid belief networks.

2 Mean field theory

In this section we consider a form of mean field theory that was previously proposed by Plefka [13] for Boltzmann-Gibbs distributions. It turns out, however, that the restriction to Boltzmann-Gibbs distributions is not necessary, and one can derive results that are valid for arbitrary probability distributions. We therefore consider the general case. Our argument uses an information geometric viewpoint. For an introduction to this approach see, for instance, [1].

Let x = (x_1, \ldots, x_n) be an n-dimensional vector, with x_i taking on discrete values. Let p(x|\theta) be a probability distribution on x, parametrized by \theta. Let P = \{p(x|\theta)\} be the manifold of all the probability distributions that can be obtained by considering different values of \theta. We now assume that P contains a submanifold of factorized probability distributions in the following sense. We assume that the parametrization is such that it can be divided into two subsets, (\theta, w), and that the submanifold M \subset P of factorized probability distributions is described by w = 0. Here \theta parametrizes the factorized distributions in the manifold M, and w parametrizes the remainder of the manifold P. We will denote factorized distributions by q(x|\theta) = p(x|\theta, w = 0).

Consider an arbitrary probability distribution p(x|\theta, w) \in P. We define its mean field approximation as the factorized distribution q(x|\theta_q) \in M that is closest to p(x|\theta, w). As a distance measure, we use the Kullback-Leibler divergence [1; 17]¹

KL = \sum_x p(x|\theta, w) \log \frac{p(x|\theta, w)}{q(x|\theta_q)} .   (1)

Since q(x|\theta_q) is a factorized distribution, q(x|\theta_q) = \prod_{i=1}^n q_i(x_i|\theta_q), we can find the closest q by differentiating the Kullback-Leibler divergence with respect to these independent components q_i(x_i|\theta_q). Using a Lagrange multiplier to ensure normalization of q_i(x_i|\theta_q), one finds that this optimal q must satisfy

q_i(x_i|\theta_q) = p(x_i|\theta, w) ,   (2)

where p(x_i|\theta, w) is the marginal distribution of p(x|\theta, w) on variable x_i.

¹ Note that, to obtain the standard variational bound using Jensen's inequality, one employs 'the other' KL divergence, with the roles of p and q reversed. As will be outlined below, the KL divergence considered here gives the same result as the Jensen's bound to lowest order.
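The marginal-matching property (2) can be checked directly on a small example. The sketch below is an illustration added here, with an arbitrary random joint distribution; it verifies that the factorized q built from the marginals of p has a lower KL divergence (1) than randomly perturbed factorized alternatives.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# A random joint distribution p(x) over n binary variables
n = 3
p = rng.random(2 ** n); p /= p.sum()
states = np.array(list(itertools.product([0, 1], repeat=n)))

def kl_p_q(q_marg):
    """KL(p || q), eq. (1), for a factorized q with one-node marginals q_marg[i]."""
    q = np.array([np.prod([q_marg[i][s[i]] for i in range(n)]) for s in states])
    return np.sum(p * np.log(p / q))

# The marginals of p ...
p_marg = [np.array([p[states[:, i] == 0].sum(), p[states[:, i] == 1].sum()])
          for i in range(n)]

# ... should beat any perturbed factorized q, as equation (2) asserts
best = kl_p_q(p_marg)
for _ in range(1000):
    trial = [np.abs(m + 0.1 * rng.standard_normal(2)) for m in p_marg]
    trial = [t / t.sum() for t in trial]
    assert kl_p_q(trial) >= best - 1e-12
print("KL at the marginals of p:", best)
```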
  • 60. mind as if for some possible future use. She never wearied of the most trivial details, while Joseph, on the other hand, would often have willingly shortened his lessons. His mind was singularly disturbed between the desire to be gratified by her curiosity, and the fact that its eager and persistent character made him uncomfortable. When an innocent, confiding nature begins to suspect that its confidence has been misplaced, the first result is a preternatural stubbornness to admit the truth. The clearest impressions are resisted, or half-consciously misinterpreted, with the last force of an illusion which already foresees its own overthrow. Joseph eagerly clung to every look and word and action which confirmed his sliding faith in his wife's sweet and simple character, and repelled—though a deeper instinct told him that a day would come when it must be admitted—the evidence of her coldness and selfishness. Yet, even while almost fiercely asserting to his own heart that he had every reason to be happy, he was consumed with a secret fever of unrest, doubt, and dread. The horns of the growing moon were still turned downwards, and cold, dreary rains were poured upon the land. Julia's patience, in such straits, was wonderful, if the truth had been known, but she saw that some change was necessary for both of them. She therefore proposed, not what she most desired, but what her circumstances prescribed,—a visit from her sister Clementina. Joseph found the request natural enough: it was an infliction, but one which he had anticipated; and after the time had been arranged by letter, he drove to the station to meet the westward train from the city. Clementina stepped upon the platform, so cloaked and hooded that he only recognized her by the deliberate grace of her movements. She extended her hand, giving his a cordial pressure, which was explained by the brass baggage-checks thus transferred to his charge. "I will wait in the ladies' room," was all she said. At the same moment Joseph's arm was grasped.
  • 61. "What a lucky chance!" exclaimed Philip: then, suddenly pausing in his greeting, he lifted his hat and bowed to Clementina, who nodded slightly as she passed into the room. "Let me look at you!" Philip resumed, laying his hands on Joseph's shoulders. Their eyes met and lingered, and Joseph felt the blood rise to his face as Philip's gaze sank more deeply into his heart and seemed to fathom its hidden trouble; but presently Philip smiled and said: "I scarcely knew, until this moment, that I had missed you so much, Joseph!" "Have you come to stay?" Joseph asked. "I think so. The branch railway down the valley, which you know was projected, is to be built immediately; but there are other reasons why the furnaces should be in blast. If it is possible, the work—and my settlement with it—will begin without any further delay. Is she your first family visit?" He pointed towards the station. "She will be with us a fortnight; but you will come, Philip?" "To be sure!" Philip exclaimed. "I only saw her face indistinctly through the veil, but her nod said to me, 'A nearer approach is not objectionable.' Certainly, Miss Blessing; but with all the conventional forms, if you please!" There was something of scorn and bitterness in the laugh which accompanied these words, and Joseph looked at him with a puzzled air. "You may as well know now," Philip whispered, "that when I was a spoony youth of twenty, I very nearly imagined myself in love with Miss Clementina Blessing, and she encouraged my greenness until it spread as fast as a bamboo or a gourd-vine. Of course, I've long since congratulated myself that she cut me up, root and branch, when our family fortune was lost. The awkwardness of our intercourse is all on her side. Can she still have faith in her charms
  • 62. and my youth, I wonder? Ye gods! that would be a lovely conclusion of the comedy!" Joseph could only join in the laugh as they parted. There was no time to reflect upon what had been said. Clementina, nevertheless, assumed a new interest in his eyes; and as he drove her towards the farm, he could not avoid connecting her with Philip in his thoughts. She, too, was evidently preoccupied with the meeting, for Philip's name soon floated to the surface of their conversation. "I expect a visit from him soon," said Joseph. As she was silent, he ventured to add: "You have no objections to meeting with him, I suppose?" "Mr. Held is still a gentleman, I believe," Clementina replied, and then changed the subject of conversation. Julia flew at her sister with open arms, and showered on her a profusion of kisses, all of which were received with perfect serenity, Clementina merely saying, as soon as she could get breath: "Dear me, Julia, I scarcely recognize you! You are already so countrified!" Rachel Miller, although a woman, and notwithstanding her recent experience, found herself greatly bewildered by this new apparition. Clementina's slow, deliberate movements and her even-toned, musical utterance impressed her with a certain respect; yet the qualities of character they suggested never manifested themselves. On the contrary, the same words, in any other mouth, would have often expressed malice or heartlessness. Sometimes she heard her own homely phrases repeated, as if by the most unconscious purposeless imitation, and had Julia either smiled or appeared annoyed her suspicions might have been excited; as it was, she was constantly and sorely puzzled. Once only, and for a moment, the two masks were slightly lifted. At dinner, Clementina, who had turned the conversation upon the subject of birthdays, suddenly said to Joseph: "By the way, Mr. Asten, has Julia told you her age?"
  • 63. Julia gave a little start, but presently looked up, with an expression meant to be artless. "I knew it before we were married," Joseph quietly answered. Clementina bit her lip. Julia, concealing her surprise, flashed a triumphant glance at her sister, then a tender one at Joseph, and said: "We will both let the old birthdays go; we will only have one and the same anniversary from this time on!" Joseph felt, through some natural magnetism of his nature rather than from any perceptible evidence, that Clementina was sharply and curiously watching the relation between himself and his wife. He had no fear of her detecting misgivings which were not yet acknowledged to himself, but was instinctively on his guard in her presence. It was not many days before Philip called. Julia received him cordially, as the friend of her husband, while Clementina bowed with an impassive face, without rising from her seat. Philip, however, crossed the room and gave her his hand, saying cheerily: "We used to be old friends, Miss Blessing. You have not forgotten me?" "We cannot forget when we have been asked to do so," she warbled. Philip took a chair. "Eight years!" he said: "I am the only one who has changed in that time." Julia, looked at her sister, but the latter was apparently absorbed in comparing some zephyr tints. "The whirligig of time!" he exclaimed: "who can foresee anything? Then I was an ignorant, petted young aristocrat,—an expectant heir; now behold me, working among miners and puddlers and forgemen! It's a rough but wholesome change. Would you believe it, Mrs. Asten, I've forgotten the mazurka!" "I wish to forget it," Julia replied: "the spring-house is as important to me as the furnace to you."
  • 64. "Have you seen the Hopetons lately?" Clementina asked. Joseph saw a shade pass over Philip's face, and he seemed to hesitate a moment before answering: "I hear they will be neighbors of mine next summer. Mr. Hopeton is interested in the new branch down the valley, and has purchased the old Calvert property for a country residence." "Indeed? Then you will often see them." "I hope so: they are very agreeable people. But I shall also have my own little household: my sister will probably join me." "Not Madeline!" exclaimed Julia. "Madeline," Philip answered. "It has long been her wish, as well as mine. You know the little cottage on the knoll, at Coventry, Joseph! I have taken it for a year." "There will be quite a city society," murmured Clementina, in her sweetest tones. "You will need no commiseration, Julia. Unless, indeed, the country people succeed in changing you all into their own likeness. Mrs. Hopeton will certainly create a sensation. I am told that she is very extravagant, Mr. Held?" "I have never seen her husband's bank account," said Philip, dryly. He rose presently, and Joseph accompanied him to the lane. Philip, with the bridle-rein over his arm, delayed to mount his horse, while the mechanical commonplaces of speech, which, somehow, always absurdly come to the lips when graver interests have possession of the heart, were exchanged by the two. Joseph felt, rather than saw, that Philip was troubled. Presently the latter said: "Something is coming over both of us,—not between us. I thought I should tell you a little more, but perhaps it is too soon. If I guess rightly, neither of us is ready. Only this, Joseph, let us each think of the other as a help and a support!" "I do, Philip!" Joseph answered. "I see there is some influence at work which I do not understand, but I am not impatient to know
  • 65. what it is. As for myself, I seem to know nothing at all; but you can judge,—you see all there is." Even as he pronounced these words Joseph felt that they were not strictly sincere, and almost expected to find an expression of reproof in Philip's eyes. But no: they softened until he only saw a pitying tenderness. Then he knew that the doubts which he had resisted with all the force of his nature were clearly revealed to Philip's mind. They shook hands, and parted in silence; and Joseph, as he looked up to the gray blank of heaven, asked himself: "Is this all? Has my life already taken the permanent imprint of its future?"
  • 67. CHAPTER XIV. THE AMARANTH. Clementina returned to the city without having made any very satisfactory discovery. Her parting was therefore conventionally tender: she even thanked Joseph for his hospitality, and endeavored to throw a little natural emphasis into her words as she expressed the hope of being allowed to renew her visit in the summer. During her stay it seemed to Joseph that the early harmony of his household had been restored. Julia's manner had been so gentle and amiable, that, on looking back, he was inclined to believe that the loneliness of her new life was alone responsible for any change. But after Clementina's departure his doubts were reawakened in a more threatening form. He could not guess, as yet, the terrible chafing of a smiling mask; of a restraint which must not only conceal itself, but counterfeit its opposite; of the assumption by a narrow, cold, and selfish nature of virtues which it secretly despises. He could not have foreseen that the gentleness, which had nearly revived his faith in her, would so suddenly disappear. But it was gone, like a glimpse of the sun through the winter fog. The hard, watchful expression came back to Julia's face; the lowered eyelids no longer gave a fictitious depth to her shallow, tawny pupils; the soft roundness of her voice took on a frequent harshness, and the desire of asserting her own will in all things betrayed itself through her affected habits of yielding and seeking counsel. She continued her plan of making herself acquainted with all the details of the farm business. When the roads began to improve, in
  • 68. the early spring, she insisted in driving to the village alone, and Joseph soon found that she made good use of these journeys in extending her knowledge of the social and pecuniary standing of all the neighboring families. She talked with farmers, mechanics, and drovers; became familiar with the fluctuations in the prices of grain and cattle; learned to a penny the wages paid for every form of service; and thus felt, from week to week, the ground growing more secure under her feet. Joseph was not surprised to see that his aunt's participation in the direction of the household gradually diminished. Indeed, he scarcely noticed the circumstance at all, but he was at last forced to remark her increasing silence and the trouble of her face. To all appearance the domestic harmony was perfect, and if Rachel Miller felt some natural regret at being obliged to divide her sway, it was a matter, he thought, wherein he had best not interfere. One day, however, she surprised him by the request:— "Joseph, can you take or send me to Magnolia to-morrow?" "Certainly, Aunt!" he replied. "I suppose you want to visit Cousin Phebe; you have not seen her since last summer." "It was that,—and something more." She paused a moment, and then added, more firmly: "She has always wished that I should make my home with her, but I couldn't think of any change so long as I was needed here. It seems to me that I am not really needed now." "Why, Aunt Rachel!" Joseph exclaimed, "I meant this to be your home always, as much as mine! Of course you are needed,—not to do all that you have done heretofore, but as a part of the family. It is your right." "I understand all that, Joseph. But I've heard it said that a young wife should learn to see to everything herself, and Julia, I'm sure, doesn't need either my help or my advice." Joseph's face became very grave. "Has she—has she—?" he stammered.
  • 69. "No," said Rachel, "she has not said it—in words. Different persons have different ways. She is quick, O very quick!—and capable. You know I could never sit idly by, and look on; and it's hard to be directed. I seem to belong to the place and everything connected with it; yet there's times when what a body ought to do is plain." In endeavoring to steer a middle course between her conscience and her tender regard for her nephew's feelings Rachel only confused and troubled him. Her words conveyed something of the truth which she sought to hide under them. She was both angered and humiliated; the resistance with which she had attempted to meet Julia's domestic innovations was no match for the latter's tactics; it had gone down like a barrier of reeds and been contemptuously trampled under foot. She saw herself limited, opposed, and finally set aside by a cheerful dexterity of management which evaded her grasp whenever she tried to resent it. Definite acts, whereon to base her indignation, seemed to slip from her memory, but the atmosphere of the house became fatal to her. She felt this while she spoke, and felt also that Joseph must be spared. "Aunt Rachel," said he, "I know that Julia is very anxious to learn everything which she thinks belongs to her place,—perhaps a little more than is really necessary. She's an enthusiastic nature, you know. Maybe you are not fully acquainted yet; maybe you have misunderstood her in some things: I would like to think so." "It is true that we are different, Joseph,—very different. I don't say, therefore, that I'm always right. It's likely, indeed, that any young wife and any old housekeeper like myself would have their various notions. But where there can be only one head, it's the wife's place to be that head. Julia has not asked it of me, but she has the right. I can't say, also, that I don't need a little rest and change, and there seems to be some call on me to oblige Phebe. Look at the matter in the true light," she continued, seeing that Joseph remained silent, "and you must feel that it's only natural."
  • 70. "I hope so," he said at last, repressing a sigh; "all things are changing." "What can we do?" Julia asked, that evening, when he had communicated to her his aunt's resolution; "it would be so delightful if she would stay, and yet I have had a presentiment that she would leave us—for a little while only, I hope. Dear, good Aunt Rachel! I couldn't help seeing how hard it was for her to allow the least change in the order of housekeeping. She would be perfectly happy if I would sit still all day and let her tire herself to death; but how can I do that, Joseph? And no two women have exactly the same ways and habits. I've tried to make everything pleasant for her: if she would only leave many little matters entirely to me, or at least not think of them,—but I fear she cannot. She manages to see the least that I do, and secretly worries about it, in the very kindness of her heart. Why can't women carry on partnerships in housekeeping as men do in business? I suppose we are too particular; perhaps I am just as much so as Aunt Rachel. I have no doubt she thinks a little hardly of me, and so it would do her good—we should really come nearer again—if she had a change. If she will go, Joseph, she must at least leave us with the feeling that our home is always hers, whenever she chooses to accept it." Julia bent over Joseph's chair, gave him a rapid kiss, and then went off to make her peace with Aunt Rachel. When the two women came to the tea-table the latter had an uncertain, bewildered air, while the eyelids of the former were red,—either from tears or much rubbing. A fortnight afterwards Rachel Miller left the farm and went to reside with her widowed niece, in Magnolia. The day after her departure another surprise came to Joseph in the person of his father-in-law. Mr. Blessing arrived in a hired vehicle from the station. His face was so red and radiant from the March winds, and perhaps some private source of satisfaction, that his sudden arrival could not possibly be interpreted as an omen of ill- fortune. He shook hands with the Irish groom who had driven him
  • 71. over, gave him a handsome gratuity in addition to the hire of the team, extracted an elegant travelling-satchel from under the seat, and met Joseph at the gate, with a breezy burst of feeling:— "God bless you, son-in-law! It does my heart good to see you again! And then, at last, the pleasure of beholding your ancestral seat; really, this is quite—quite manorial!" Julia, with a loud cry of "O pa!" came rushing from the house. "Bless me, how wild and fresh the child looks!" cried Mr. Blessing, after the embrace. "Only see the country roses on her cheeks! Almost too young and sparkling for Lady Asten, of Asten Hall, eh? As Dryden says, 'Happy, happy, happy pair!' It takes me back to the days when I was a gay young lark; but I must have a care, and not make an old fool of myself. Let us go in and subside into soberness: I am ready both to laugh and cry." When they were seated in the comfortable front room, Mr. Blessing opened his satchel and produced a large leather-covered flask. Julia was probably accustomed to his habits, for she at once brought a glass from the sideboard. "I am still plagued with my old cramps," her father said to Joseph, as he poured out a stout dose. "Physiologists, you know, have discovered that stimulants diminish the wear and tear of life, and I find their theories correct. You, in your pastoral isolation and pecuniary security, can form no conception of the tension under which we men of office and of the world live, Beatus ille, and so forth,—strange that the only fragment of Latin which I remember should be so appropriate! A little water, if you please, Julia." In the evening, when Mr. Blessing, slippered, sat before the open fireplace, with a cigar in his mouth, the object of his sudden visit crept by slow degrees to the light. "Have you been dipping into oil?" he asked Joseph. Julia made haste to reply. "Not yet, but almost everybody in the neighborhood is ready to do so now, since Clemson has realized his fifty thousand dollars in a single year. They are talking of nothing
  • 72. else in the village. I heard yesterday, Joseph, that Old Bishop has taken three thousand dollars' worth of stock in a new company." "Take my advice, and don't touch 'em!" exclaimed Mr. Blessing. "I had not intended to," said Joseph. "There is this thing about these excitements," Mr. Blessing continued: "they never reach the rural districts until the first sure harvest is over. The sharp, intelligent operators in the large cities— the men who are ready to take up soap, thimbles, hand-organs, electricity, or hymn-books, at a moment's notice—always cut into a new thing before its value is guessed by the multitude. Then the smaller fry follow and secure their second crop, while your quiet men in the country are shaking their heads and crying 'humbug!' Finally, when it really gets to be a humbug, in a speculative sense, they just begin to believe in it, and are fair game for the bummers and camp- followers of the financial army. I respect Clemson, though I never heard of him before; as for Old Bishop, he may be a very worthy man, but he'll never see the color of his three thousand dollars again." "Pa!" cried Julia, "how clear you do make everything. And to think that I was wishing—O, wishing so much!—that Joseph would go into oil." She hung her head a little, looking at Joseph with an affectionate, penitent glance. A quick gleam of satisfaction passed over Mr. Blessing's face; he smiled to himself, puffed rapidly at his cigar for a minute, and then resumed: "In such a field of speculation everything depends on being initiated. There are men in the city—friends of mine—who know every foot of ground in the Alleghany Valley. They can smell oil, if it's a thousand feet deep. They never touch a thing that isn't safe,—but, then, they know what's safe. In spite of the swindling that's going on, it takes years to exhaust the good points; just so sure as your honest neighbors here will lose, just so sure will these friends of mine gain. There are millions in what they have under way, at this moment."
  • 73. "What is it?" Julia breathlessly asked, while Joseph's face betrayed that his interest was somewhat aroused. Mr. Blessing unlocked his satchel, and took from it a roll of paper, which he began to unfold upon his knee. "Here," he said, "you see this bend of the river, just about the centre of the oil region, which is represented by the yellow color. These little dots above the bend are the celebrated Fluke Wells; the other dots below are the equally celebrated Chowder Wells. The distance between the two is nearly three miles. Here is an untouched portion of the treasure,—a pocket of Pactolus waiting to be rifled. A few of us have acquired the land, and shall commence boring immediately." "But," said Joseph, "it seems to me that either the attempt must have been made already, or that the land must command such an enormous price as to lessen the profits." "Wisely spoken! It is the first question which would occur to any prudent mind. But what if I say that neither is the case? And you, who are familiar with the frequent eccentricities of old farmers, can understand the explanation. The owner of the land was one of your ignorant, stubborn men, who took such a dislike to the prospectors and speculators, that he refused to let them come near him. Both the Fluke and Chowder Companies tried their best to buy him out, but he had a malicious pleasure in leading them on to make immense offers, and then refusing. Well, a few months ago he died, and his heirs were willing enough to let the land go; but before it could be regularly offered for sale, the Fluke and Chowder Wells began to flow less and less. Their shares fell from 270 to 95; the supposed value of the land fell with them, and finally the moment arrived when we could purchase for a very moderate sum. I see the question in your mind; why should we wish to buy when the other wells were giving out? There comes in the secret, which is our veritable success. Consider it whispered in your ears, and locked in your bosoms,—torpedoes! It was not then generally exploded (to carry out the image), so we bought at the low figure, in the very nick of time. Within a week the Fluke and Chowder Wells were
  • 74. torpedoed, and came back to more than their former capacity; the shares rose as rapidly as they had fallen, and the central body we hold—to which they are, as it were, the two arms—could now be sold for ten times what it cost us!" Here Mr. Blessing paused, with his finger on the map, and a light of merited triumph in his eyes. Julia clapped her hands, sprang to her feet, and cried: "Trumps at last!" "Ay," said he, "wealth, repose for my old days,—wealth for us all, if your husband will but take the hand I hold out to him. You now know, son-in-law, why the endorsement you gave me was of such vital importance; the note, as you are aware, will mature in another week. Why should you not charge yourself with the payment, in consideration of the transfer to you of shares of the original stock, already so immensely appreciated in value? I have delayed making any provision, for the sake of offering you the chance." Julia was about to speak, but restrained herself with an apparent effort. "I should like to know," Joseph said, "who are associated with you in the undertaking?" "Well done, again! Where did you get your practical shrewdness? The best men in the city!—not only the Collector and the Surveyor, but Congressman Whaley, E. D. Stokes, of Stokes, Pirricutt and Company, and even the Reverend Doctor Lellifant. If I had not been an old friend of Kanuck, the agent who negotiated the purchase, my chance would have been impalpably small. I have all the documents with me. There has been no more splendid opportunity since oil became a power! I hesitate to advise even one so near to me in such matters; but if you knew the certainties as I know them, you would go in with all your available capital. The excitement, as you say, has reached the country communities, which are slow to rise and equally slow to subside; all oil stock will be in demand, but the Amaranth,—'The Blessing,' they wished to call it, but I was obliged
  • 75. to decline, for official reasons,—the Amaranth shares will be the golden apex of the market!" Julia looked at Joseph with eager, hungry eyes. He, too, was warmed and tempted by the prospect of easy profit which the scheme held out to him; only the habit of his nature resisted, but with still diminishing force. "I might venture the thousand," he said. "It is no venture!" Julia cried. "In all the speculations I have heard discussed by pa and his friends, there was nothing so admirably managed as this. Such a certainty of profit may never come again. If you will be advised by me, Joseph, you will take shares to the amount of five or ten thousand." "Ten thousand is exactly the amount I hold open," Mr. Blessing gravely remarked. "That, however, does not represent the necessary payment, which can hardly amount to more than twenty-five per cent. before we begin to realize. Only ten per cent. has yet been called, so that your thousand at present will secure you an investment of ten thousand. Really, it seems like a fortunate coincidence." He went on, heating himself with his own words, until the possibilities of the case grew so splendid that Joseph felt himself dazzled and bewildered. Mr. Blessing was a master in the art of seductive statement. Even where he was only the mouthpiece of another, a few repetitions led him to the profoundest belief. Here there could be no doubt of his sincerity, and, moreover, every movement from the very inception of the scheme, every statistical item, all collateral influences, were clear in his mind and instantly accessible. Although he began by saying, "I will make no estimate of the profits, because it is not prudent to fix our hopes on a positive sum," he was soon carried far away from this resolution, and most luxuriously engaged, pencil in hand, in figuring out results which drove Julia wild with desire, and almost took away Joseph's breath. The latter finally said, as they rose from the session, late at night:—
  • 76. "It is settled that I take as much as the thousand will cover; but I would rather think over the matter quietly for a day or two before venturing further." "You must," replied Mr. Blessing, patting him on the shoulder. "These things are so new to your experience, that they disturb and— I might almost say—alarm you. It is like bringing an increase of oxygen into your mental atmosphere. (Ha! a good figure: for the result will be, a richer, fuller life. I must remember it.) But you are a healthy organization, and therefore you are certain to see clearly: I can wait with confidence." The next morning Joseph, without declaring his purpose, drove to Coventry Forge to consult Philip. Mr. Blessing and Julia, remaining at home, went over the shining ground again, and yet again, confirming each other in the determination to secure it. Even Joseph, as he passed up the valley in the mild March weather, taking note of the crimson and gold of the flowering spice-bushes and maple-trees, could not prevent his thoughts from dwelling on the delights of wealth,—society, books, travel, and all the mellow, fortunate expansion of life. Involuntarily, he hoped that Philip's counsel might coincide with his father-in-law's offer. But Philip was not at home. The forge was in full activity, the cottage on the knoll was repainted and made attractive in various ways, and Philip would soon return with his sister to establish a permanent home. Joseph found the sign-spiritual of his friend in numberless little touches and changes; it seemed to him that a new soul had entered into the scenery of the place. A mile or two farther up the valley, a company of mechanics and laborers were apparently tearing the old Calvert mansion inside out. House, barn, garden, and lawn were undergoing a complete transformation. While he paused at the entrance of the private lane, to take a survey of the operations, Mr. Clemson rode down to him from the house. The Hopetons, he said, would migrate from the city early in May: work had already commenced on the new railway, and
  • 77. in another year a different life would come upon the whole neighborhood. In the course of the conversation Joseph ventured to sound Mr. Clemson in regard to the newly formed oil companies. The latter frankly confessed that he had withdrawn from further speculation, satisfied with his fortune; he preferred to give no opinion, further than that money was still to be made, if prudently placed. Tho Fluke and Chowder Wells, he said, were old, well known, and profitable. The new application of torpedoes had restored their failing flow, and the stock had recovered from its temporary depreciation. His own venture had been made in another part of the region. The atmosphere into which Joseph entered, on returning home, took away all further power of resistance. Tempted already, and impressed by what he had learned, he did what his wife and father- in-law desired.
  • 79. CHAPTER XV. A DINNER PARTY. Having assumed the payment of Mr. Blessing's note, as the first instalment upon his stock, Joseph was compelled to prepare himself for future emergencies. A year must still elapse before the term of the mortgage upon his farm would expire, but the sums he had invested for the purpose of meeting it when due must be held ready for use. The assurance of great and certain profit in the mean time rendered this step easy; and, even at the worst, he reflected, there would be no difficulty in procuring a new mortgage whereby to liquidate the old. A notice which he received at this time, that a second assessment of ten per cent. on the Amaranth stock had been made, was both unexpected and disquieting. Mr. Blessing, however, accompanied it with a letter, making clear not only the necessity, but the admirable wisdom of a greater present outlay than had been anticipated. So the first of April—the usual business anniversary of the neighborhood—went smoothly by. Money was plenty, the Asten credit had always been sound, and Joseph tasted for the first time a pleasant sense of power in so easily receiving and transferring considerable sums. One result of the venture was the development of a new phase in Julia's nature. She not only accepted the future profit as certain, but she had apparently calculated its exact amount and framed her plans accordingly. If she had been humiliated by the character of Joseph's first business transaction with her father, she now made amends for it. "Pa" was their good genius. "Pa" was the agency
  • 80. whereby they should achieve wealth and social importance. Joseph now had the clearest evidence of the difference between a man who knew the world and was of value in it, and their slow, dull-headed country neighbors. Indeed, Julia seemed to consider the Asten property as rather contemptible beside the splendor of the Blessing scheme. Her gratitude for a quiet home, her love of country life, her disparagement of the shams and exactions of "society," were given up as suddenly and coolly as if she had never affected them. She gave herself no pains to make the transition gradual, and thus lessen its shock. Perhaps she supposed that Joseph's fresh, unsuspicious nature was so plastic that it had already sufficiently taken her impress, and that he would easily forget the mask she had worn. If so, she was seriously mistaken. He saw, with a deadly chill of the heart, the change in her manner, —a change so complete that another face confronted him at the table, even as another heart beat beside his on the dishallowed marriage-bed. He saw the gentle droop vanish from the eyelids, leaving the cold, flinty pupils unshaded; the soft appeal of the half- opened lips was lost in the rigid, almost cruel compression which now seemed habitual to them; all the slight dependent gestures, the tender airs of reference to his will or pleasure, had rapidly transformed themselves into expressions of command or obstinate resistance. But the patience of a loving man is equal to that of a loving woman: he was silent, although his silence covered an ever- increasing sense of outrage. Once it happened, that after Julia had been unusually eloquent concerning "what pa is doing for us," and what use they should make of "pa's money, as I call it," Joseph quietly remarked:— "You seem to forget, Julia, that without my money not much could have been done." An angry color came into her face; but, on second thought, she bent her head, and murmured in an offended voice: "It is very mean and ungenerous in you to refer to our temporary poverty. You might forget, by this time, the help pa was compelled to ask of you."
  • 81. "I did not think of that!" he exclaimed. "Besides, you did not seem entirely satisfied with my help, at the time." "O, how you misunderstand me!" she groaned. "I only wished to know the extent of his need. He is so generous, so considerate towards us, that we only guess his misfortune at the last moment." The possibility of being unjust silenced Joseph. There were tears in Julia's voice, and he imagined they would soon rise to her eyes. After a long, uncomfortable pause, he said, for the sake of changing the subject: "What can have become of Elwood Withers? I have not seen him for months." "I don't think you need care to know," she remarked. "He's a rough, vulgar fellow: it's just as well if he keeps away from us." "Julia! he is my friend, and must always be welcome to me. You were friendly enough towards him, and towards all the neighborhood, last summer: how is it that you have not a good word to say now?" He spoke warmly and indignantly. Julia, however, looked at him with a calm, smiling face. "It is very simple," she said. "You will agree with me, in another year. A guest, as I was, must try to see only the pleasant side of people: that's our duty; and so I enjoyed— as much as I could—the rusticity, the awkwardness, the ignorance, the (now, don't be vexed, dear!)—the vulgarity of your friend. As one of the society of the neighborhood, as a resident, I am not bound by any such delicacy. I take the same right to judge and select as I should take anywhere. Unless I am to be hypocritical, I cannot—towards you, at least—conceal my real feelings. How shall I ever get you to see the difference between yourself and these people, unless I continually point it out? You are modest, and don't like to acknowledge your own superiority." She rose from the table, laughing, and went out of the room humming a lively air, leaving Joseph to make the best of her words. A few days after this the work on the branch railway, extending down the valley, reached a point where it could be seen from the
  • 82. Asten farm. Joseph, on riding over to inspect the operations, was surprised to find Elwood, who had left his father's place and become a sub-contractor. The latter showed his hearty delight at their meeting. "I've been meaning to come up," he said, "but this is a busy time for me. It's a chance I couldn't let slip, and now that I've taken hold I must hold on. I begin to think this is the thing I was made for, Joseph." "I never thought of it before," Joseph answered, "and yet I'm sure you are right. How did you hit upon it?" "I didn't; it was Mr. Held." "Philip?" "Him. You know I've been hauling for the Forge, and so it turned up by degrees, as I may say. He's at home, and, I expect, looking for you. But how are you now, really?" Elwood's question meant a great deal more than he knew how to say. Suddenly, in a flash of memory, their talk of the previous year returned to Joseph's mind; he saw his friend's true instincts and his own blindness as never before. But he must dissemble, if possible, with that strong, rough, kindly face before him. "O," he said, attempting a cheerful air, "I am one of the old folks now. You must come up—" The recollection of Julia's words cut short the invitation upon his lips. A sharp pang went through his heart, and the treacherous blood crowded to his face all the more that he tried to hold it back. "Come, and I'll show you where we're going to make the cutting," Elwood quietly said, taking him by the arm. Joseph fancied, thenceforth, that there was a special kindness in his manner, and the suspicion seemed to rankle in his mind as if he had been slighted by his friend.
  • 83. As before, to vary the tedium of his empty life, so now, to escape from the knowledge which he found himself more and more powerless to resist, he busied himself beyond all need with the work of the farm. Philip had returned with his sister, he knew, but after the meeting with Elwood he shrank with a painful dread from Philip's heart-deep, intimate eye. Julia, however, all the more made use of the soft spring weather to survey the social ground, and choose where to take her stand. Joseph scarcely knew, indeed, how extensive her operations had been, until she announced an invitation to dine with the Hopetons, who were now in possession of the renovated Calvert place. She enlarged, more than was necessary, on the distinguished city position of the family, and the importance of "cultivating" its country members. Joseph's single brief meeting with Mr. Hopeton—who was a short, solid man, in ripe middle age, of a thoroughly cosmopolitan, though not a remarkably intellectual stamp —had been agreeable, and he recognized the obligation to be neighborly. Therefore he readily accepted the invitation on his own grounds. When the day arrived, Julia, after spending the morning over her toilet, came forth resplendent in rosy silk, bright and dazzling in complexion, and with all her former grace of languid eyelids and parted lips. The void in Joseph's heart grew wider at the sight of her; for he perceived, as never before, her consummate skill in assuming a false character. It seemed incredible that he should have been so deluded. For the first time a feeling of repulsion, which was almost disgust, came upon him as he listened to her prattle of delight in the soft weather, and the fragrant woods, and the blossoming orchards. Was not, also, this delight assumed? he asked himself: false in one thing, false in all, was the fatal logic which then and there began its torment. The most that was possible in such a short time had been achieved on the Calvert place. The house had been brightened, surrounded by light, airy verandas, and the lawn and garden, thrown into one and given into the hands of a skilful gardener, were scarcely to be recognized. A broad, solid gravel-walk replaced the old tan-
  • 84. covered path; a pretty fountain tinkled before the door; thick beds of geranium in flower studded the turf, and veritable thickets of rose- trees were waiting for June. Within the house, some rooms had been thrown together, the walls richly yet harmoniously colored, and the sumptuous furniture thus received a proper setting. In contrast to the houses of even the wealthiest farmers, which expressed a nicely reckoned sufficiency of comfort, the place had an air of joyous profusion, of a wealth which delighted in itself. Mr. Hopeton met them with the frank, offhand manner of a man of business. His wife followed, and the two guests made a rapid inspection of her as she came down the hall. Julia noticed that her crocus-colored dress was high in the neck, and plainly trimmed; that she wore no ornaments, and that the natural pallor of her complexion had not been corrected by art. Joseph remarked the simple grace of her movement, the large, dark, inscrutable eyes, the smooth bands of her black hair, and the pure though somewhat lengthened oval of her face. The gentle dignity of her manner more than refreshed, it soothed him. She was so much younger than her husband that Joseph involuntarily wondered how they should have come together. The greetings were scarcely over before Philip and Madeline Held arrived. Julia, with the least little gush of tenderness, kissed the latter, whom Philip then presented to Joseph for the first time. She had the same wavy hair as her brother, but the golden hue was deepened nearly into brown, and her eyes were a clear hazel. It was also the same frank, firm face, but her woman's smile was so much the sweeter as her lips were lovelier than the man's. Joseph seemed to clasp an instant friendship in her offered hand. There was but one other guest, who, somewhat to his surprise, was Lucy Henderson. Julia concealed whatever she might have felt, and made so much reference to their former meetings as might satisfy Lucy without conveying to Mrs. Hopeton the impression of any special intimacy. Lucy looked thin and worn, and her black silk dress was not of the latest fashion: she seemed to be the poor
  • 85. relation of the company. Joseph learned that she had taken one of the schools in the valley, for the summer. Her manner to him was as simple and friendly as ever, but he felt the presence of some new element of strength and self-reliance in her nature. His place at dinner was beside Mrs. Hopeton, while Lucy— apparently by accident—sat upon the other side of the hostess. Philip and the host led the conversation, confining it too exclusively to the railroad and iron interests; but those finally languished, and gave way to other topics in which all could take part. Joseph felt that while the others, except Lucy and himself, were fashioned under different aspects of life, some of which they shared in common, yet that their seeming ease and freedom of communication touched, here and there, some invisible limit, which they were careful not to pass. Even Philip appeared to be beyond his reach, for the time. The country and the people, being comparatively new to them, naturally came to be discussed. "Mr. Held, or Mr. Asten,—either of you know both,"—Mr. Hopeton asked, "what are the principal points of difference between society in the city and in the country?" "Indeed, I know too little of the city," said Joseph. "And I know too little of the country,—here, at least," Philip added. "Of course the same passions and prejudices come into play everywhere. There are circles, there are jealousies, ups and downs, scandals, suppressions, and rehabilitations: it can't be otherwise." "Are they not a little worse in the country," said Julia, "because—I may ask the question here, among us—there is less refinement of manner?" "If the external forms are ruder," Philip resumed, "it may be an advantage, in one sense. Hypocrisy cannot be developed into an art." Julia bit her lip, and was silent.
  • 86. "But are the country people, hereabouts, so rough?" Mrs. Hopeton asked. "I confess that they don't seem so to me. What do you say, Miss Henderson?" "Perhaps I am not an impartial witness," Lucy answered. "We care less about what is called 'manners' than the city people. We have no fixed rules for dress and behavior,—only we don't like any one to differ too much from the rest of us." "That's it!" Mr. Hopeton cried; "the tyrannical levelling sentiment of an imperfectly developed community! Fortunately, I am beyond its reach." Julia's eyes sparkled: she looked across the table at Joseph, with a triumphant air. Philip suddenly raised his head. "How would you correct it? Simply by resistance?" he asked. Mr. Hopeton laughed. "I should no doubt get myself into a hornet's-nest. No; by indifference!" Then Madeline Held spoke. "Excuse me," she said; "but is indifference possible, even if it were right? You seem to take the levelling spirit for granted, without looking into its character and causes; there must be some natural sense of justice, no matter how imperfectly society is developed. We are members of this community,—at least, Philip and I certainly consider ourselves so,— and I am determined not to judge it without knowledge, or to offend what may be only mechanical habits of thought, unless I can see a sure advantage in doing so." Lucy Henderson looked at the speaker with a bright, grateful face. Joseph's eyes wandered from her to Julia, who was silent and watchful. "But I have no time for such conscientious studies," Mr. Hopeton resumed. "One can be satisfied with half a dozen neighbors, and let the mass go. Indifference, after all, is the best philosophy. What do you say, Mr. Held?"
  • 87. "Indifference!" Philip echoed. A dark flush came into his face, and he was silent a moment. "Yes: our hearts are inconvenient appendages. We suffer a deal from unnecessary sympathies, and from imagining, I suppose, that others feel them as we do. These uneasy features of society are simply the effort of nature to find some occupation for brains otherwise idle—or empty. Teach the people to think, and they will disappear." Joseph stared at Philip, feeling that a secret bitterness was hidden under his careless, mocking air. Mrs. Hopeton rose, and the company left the table. Madeline Held had a troubled expression, but there was an eager, singular brightness in Julia's eyes. "Emily, let us have coffee on the veranda," said Mr. Hopeton, leading the way. He had already half forgotten the subject of conversation: his own expressions, in fact, had been made very much at random, for the sole purpose of keeping up the flow of talk. He had no very fixed views of any kind, beyond the sphere of his business activity. Philip, noticing the impression he had made on Joseph, drew him to one side. "Don't seriously remember my words against me," he said; "you were sorry to hear them, I know. All I meant was, that an over-sensitive tenderness towards everybody is a fault. Besides, I was provoked to answer him in his own vein." "But, Philip!" Joseph whispered, "such words tempt me! What if they were true?" Philip grasped his arm with a painful force. "They never can be true to you, Joseph," he said. Gay and pleasant as the company seemed to be, each one felt a secret sense of relief when it came to an end. As Joseph drove homewards, silently recalling what had been said, Julia interrupted his reflections with: "Well, what do you think of the Hopetons?" "She is an interesting woman," he answered.
  • 88. "But reserved; and she shows very little taste in dress. However, I suppose you hardly noticed anything of the kind. She kept Lucy Henderson beside her as a foil: Madeline Held would have been damaging." Joseph only partly guessed her meaning; it was repugnant, and he determined to avoid its further discussion. "Hopeton is a shrewd business man," Julia continued, "but he cannot compare with her for shrewdness—either with her or—Philip Held!" "What do you mean?" "I made a discovery before the dinner was over, which you— innocent, unsuspecting man that you are—might have before your eyes for years, without seeing it. Tell me now, honestly, did you notice nothing?" "What should I notice, beyond what was said?" he asked. "That was the least!" she cried; "but, of course, I knew you couldn't. And perhaps you won't believe me, when I tell you that Philip Held,—your particular friend, your hero, for aught I know your pattern of virtue and character, and all that is manly and noble,— that Philip Held, I say, is furiously in love with Mrs. Hopeton!" Joseph started as if he had been shot, and turned around with an angry red on his brow. "Julia!" he said, "how dare you speak so of Philip!" She laughed. "Because I dare to speak the truth, when I see it. I thought I should surprise you. I remembered a certain rumor I had heard before she was married,—while she was Emily Marrable,—and I watched them closer than they guessed. I'm certain of Philip: as for her, she's a deep creature, and she was on her guard; but they are near neighbors." Joseph was thoroughly aroused and indignant. "It is your own fancy!" he exclaimed. "You hate Philip on account of that affair with
  • 89. Clementina; but you ought to have some respect for the woman whose hospitality you have accepted!" "Bless me! I have any quantity of respect both for her and her furniture. By the by, Joseph, our parlor would furnish better than hers; I have been thinking of a few changes we might make, which would wonderfully improve the house. As for Philip, Clementina was a fool. She'd be glad enough to have him now, but in these matters, once gone is gone for good. Somehow, people who marry for love very often get rich afterwards,—ourselves, for instance." It was some time before Joseph's excitement subsided. He had resented Julia's suspicion as dishonorable to Philip, yet he could not banish the conjecture of its possible truth. If Philip's affected cynicism had tempted him, Julia's unblushing assumption of the existence of a passion which was forbidden, and therefore positively guilty, seemed to stain the pure texture of his nature. The lightness with which she spoke of the matter was even more abhorrent to him than the assertion itself; the malicious satisfaction in the tones of her voice had not escaped his ear. "Julia," he said, just before they reached home, "do not mention your fancy to another soul than me. It would reflect discredit on you." "You are innocent," she answered. "And you are not complimentary. If I have any remarkable quality, it is tact. Whenever I speak, I shall know the effect before-hand; even pa, with all his official experience, is no match for me in this line. I see what the Hopetons are after, and I mean to show them that we were first in the field. Don't be concerned, you good, excitable creature, you are no match for such well-drilled people. Let me alone, and before the summer is over we will give the law to the neighborhood!"
  • 90. Welcome to our website – the perfect destination for book lovers and knowledge seekers. We believe that every book holds a new world, offering opportunities for learning, discovery, and personal growth. That’s why we are dedicated to bringing you a diverse collection of books, ranging from classic literature and specialized publications to self-development guides and children's books. More than just a book-buying platform, we strive to be a bridge connecting you with timeless cultural and intellectual values. With an elegant, user-friendly interface and a smart search system, you can quickly find the books that best suit your interests. Additionally, our special promotions and home delivery services help you save time and fully enjoy the joy of reading. Join us on a journey of knowledge exploration, passion nurturing, and personal growth every day! ebookbell.com