\magnification=1200 \def\qed{\unskip\kern 6pt\penalty 500\raise -2pt\hbox {\vrule\vbox to 10pt{\hrule width 4pt\vfill\hrule}\vrule}} \null\vskip 4truecm \centerline{POSITIVITY OF ENTROPY PRODUCTION} \bigskip \centerline{IN NONEQUILIBRIUM STATISTICAL MECHANICS.} \bigskip \bigskip \centerline{by David Ruelle\footnote{*}{IHES (91440 Bures sur Yvette, France), and Math. Dept., Rutgers University (New Brunswick, NJ 08903, USA).}} \bigskip \bigskip \indent {\it Abstract.} We analyze different mechanisms of entropy production in statistical mechanics, and propose formulae for the entropy production rate $e(\mu)$ in a state $\mu$. When $\mu$ is a steady state describing the long term behavior of a system we show that $e(\mu)\ge0$, and sometimes we can prove $e(\mu)>0$. \bigskip \bigskip \indent {\it Key words and phrases:} ensemble, entropy production, folding entropy, nonequilibrium stationary state, nonequilibrium statistical mechanics, SRB state, thermostat. \vfill\eject \centerline{0. Introduction.} \bigskip \bigskip \indent The study of nonequilibrium statistical mechanics leads naturally to the introduction of {\it nonequilibrium states}. These are probability measures $\mu$ on the phase space of the system, suitably chosen and stationary (in principle) under the nonequilibrium time evolution. In the present paper we analyze the entropy production $e(\mu)$ for such nonequilibrium states, and show that it is positive (more precisely $\ge0$; sometimes one can prove $>0$). That the positivity of $e(\mu)$ needs a proof was repeatedly pointed out by G. Gallavotti and E.G.D. Cohen\footnote{*}{In their seminal paper [15] for instance they state: ``positivity rests on numerical evidence'', and refer to [12].}. Here we shall emphasize the physics of the problem and be particularly concerned with a proper choice of mathematical framework and definitions; the proof that $e(\mu)\ge0$ will then be relatively easy. \medskip\indent {\it Thermostatting.} \medskip\indent We shall think of a physical system having a finite (possibly large) number of degrees of freedom. The phase space ${\cal S}$ is thus a finite dimensional manifold, with a symplectic structure and therefore a natural volume element. In the situation of {\it equilibrium statistical mechanics} there are conservative forces acting on the system. A Hamiltonian $H$ is thus defined on ${\cal S}$, and energy is conserved, {\it i.e.}, time evolution is restricted to an energy shell ${\cal S}_E=\{x:H(x)=E\}$ of ${\cal S}$, where $E$ typically ranges from some lower bound $E_0$ to $+\infty$ (this is because the potential energy is $\ge E_0$ and the kinetic energy takes all values $\ge 0$). While ${\cal S}$ is noncompact and has infinite volume, ${\cal S}_E$ is compact and has finite volume. \medskip\indent In the case of {\it nonequilibrium statistical mechanics} we have nonconservative forces and, although we may be able to define a natural energy function, with values in $[E_0,+\infty)$, the energy is in general not conserved. Typically the point representing the system wanders away to infinity in the noncompact phase space ${\cal S}$ while the system heats up, {\it i.e.}, its energy tends to $+\infty$. In this situation it is not possible to define time averages corresponding to a probability measure on ${\cal S}$, {\it i.e.}, it is not possible to introduce nonequilibrium states. This difficulty follows from the noncompactness of phase space and is not tied to the special physical meaning of the energy.
(The same difficulty arises in diffusion problems where the energy is constant but the configuration space is infinite.) \medskip\indent Physically, the way to avoid heating up the system is to put it in contact with a thermostat. One can idealize the thermostat as a random interaction (with a ``heat bath''). The study of entropy production remains to be done in this framework, and should separate the randomness (or entropy) introduced by the thermostat, and that created by the system itself. \medskip\indent It is also possible to constrain the time evolution by brute force to some compact manifold $M \subset {\cal S}$. Consider for instance a system satisfying Hamiltonian equations of motion: $$ \dot X=J {\partial}_X H $$ where $X=(p,q)$ and $J{\partial}_X=(-{\partial}_q,{\partial}_p)$. The energy is conserved because $\dot H={\partial}_X H.{\dot X}={\partial}_X H.J{\partial}_X H=0$. Let us add an external driving term $F$ so that the time evolution is now $$ \dot X=J{\partial}_X H+F(X) $$ This will in general heat up the system because $$ \dot H=\partial_X H.\dot X=\partial_X H.F(X) \ne 0 $$ but if we replace $F$ by $$ F-{(\partial_X H.F) \over (\partial_X H.\partial_X H)}\partial_X H $$ then $H$ is preserved: this is the so-called {\it Gaussian thermostat} (see Hoover [18]).
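\medskip\indent {\it Example.} To make the thermostatting construction concrete (the following computation is standard and is given only as an illustration), take $X=(p,q)\in{\bf R}^{2n}$, $H={1\over 2}(p.p)+U(q)$, and a driving force $F=(\Phi (q),0)$. Then $\partial_X H=(p,\partial_q U)$ and the thermostatted equations of motion read $$ \dot p=-\partial_q U+\Phi (q)-\alpha p\ ,\qquad \dot q=p-\alpha\,\partial_q U $$ with $$ \alpha ={(p.\Phi (q)) \over (p.p)+(\partial_q U.\partial_q U)} $$ One checks directly that $$ \dot H=p.\dot p+\partial_q U.\dot q=p.\Phi (q)-\alpha [(p.p)+(\partial_q U.\partial_q U)]=0 $$ Note that the thermostatted vector field is in general no longer divergence free: phase space volume may be contracted along the motion, and this contraction is the mechanism of entropy production studied below.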
\medskip\indent To summarize, we want to act on our system to keep it outside of equilibrium, but also impose a thermostat to prevent heating. Physically this means that we pump entropy out of the system, while keeping the energy fixed. \medskip\indent From now on we shall consider a time evolution on a compact manifold $M$. We shall forget the symplectic structure (this is no longer relevant because we no longer have a Hamiltonian). We shall however need the volume element $dx$ to define the statistical mechanical entropy $S(\underline\rho)=-\int dx \underline\rho (x) \log\underline\rho (x)$ of a probability density $\underline\rho$ on $M$. Equivalent volume elements will be equivalent for our purposes because changing $dx$ to $\phi(x)dx$ replaces $S(\underline\rho)$ by $S(\underline\rho)+\int dx \underline\rho (x) \log\phi (x)$; the additive term is bounded independently of $\underline\rho$, and will play no significant role in our considerations. We may thus take for $dx$ the volume element associated with any Riemann metric. Note that $S(\underline\rho)$ is the physical entropy when $\underline\rho$ is a thermodynamic equilibrium state, but we can extend the definition to arbitrary $\underline\rho$ such that $S(\underline\rho)$ is finite. \medskip\indent The fact that we take seriously the expression $S(\underline\rho)=-\int dx\underline\rho(x)\log\underline\rho(x)$ for the entropy seems to be at variance with the point of view defended by Lebowitz [20], who prefers to give physical meaning to a {\it Boltzmann entropy} different from $S(\underline\rho)$. There is however no necessary contradiction between the two points of view, which correspond to idealizations of different physical situations. Specifically, Lebowitz discusses the entropy of states which are locally close to equilibrium, while here we analyze entropy production for certain particular steady states (which may be far from equilibrium). \medskip\indent {\it Pumping entropy out of the system.} \medskip\indent We have now reduced our mathematical framework to a smooth time evolution on a compact manifold $M$. We may also discretize the time (using a time one map $f$ or a Poincar\'e first return map $f$) and consider that the time evolution is given by iterates of $f:M \to M$. Even though the mathematical setup is now just that of a smooth dynamical system $(M,f)$, there remains the problem of studying how entropy is pumped out of the system, and how nonequilibrium states are defined. We shall consider three cases. \medskip\indent (i) $f$ is a diffeomorphism (hence $f^{-1}$ is defined). Nonequilibrium states $\mu$ may be defined by time averages corresponding to orbits $(f^k x)$ where $x \in {\cal V}(\mu)$ and ${\cal V}(\mu)$ has positive Riemann volume: vol$\,{\cal V}(\mu)>0$. More precisely, let $\delta(x)$ denote the unit mass at $x$; we may say that $\mu$ is a nonequilibrium state if $\mu=\lim_{m \to \infty} (1/m) \sum_{k=0}^{m-1} \delta (f^k x)$ for all $x \in {\cal V}(\mu)$, and vol$\,{\cal V}(\mu)>0$; special examples are the so-called SRB states. We shall see that entropy is pumped out of $\mu$ because $f$ contracts volume elements (in the average)\footnote{*}{See Chernov, Eyink, Lebowitz, and Sinai [5], Chernov and Lebowitz [6] for models with phase space contraction.}. \medskip\indent (ii) $f$ is a noninvertible map. Here the folding of the phase space caused by $f$ acts to pump entropy out of the system\footnote{**}{A model with folding of phase space has been considered by Chernov and Lebowitz [7].}. Nonequilibrium states may be defined as limits of states $(1/m)\sum_{k=0}^{m-1} f^k \rho$ with $\rho$ absolutely continuous with respect to the volume. \medskip\indent (iii) $f$ has a nonattracting set $A$ which carries a nonequilibrium state $\mu$ associated with a diffusion process\footnote{\dag}{See Gaspard and Nicolis [17], Dorfman and Gaspard [10].}. Specifically, let $A$ be an $f$-invariant subset of $M$ which is not attracting. If $U$ is a small neighborhood of $A$, $fU$ is not contained in $U$. Let $\rho$ be the Riemann volume normalized to $U$. Then $f\rho$ is not supported in $U$. We multiply by the characteristic function of $U$ and normalize to obtain a new probability measure ${\rho}_1={\parallel\chi_U .f\rho\parallel}^{-1} \chi_U.f\rho$. Iterating this process $m$ times we obtain $\rho_m$, and define $$ \rho^{(m)}={1\over m}\sum_{k=1}^m f^{-k} \rho_m $$ In the Axiom A case we shall see (Section 3 below) that $\rho^{(m)}$ tends to an $f$-invariant probability measure $\mu$ giving to $h(\mu)-\sum\,\hbox{positive Lyapunov exponents of }\mu$ its maximum value $P$ ($h$ is the {\it Kolmogorov-Sinai entropy} and the {\it pressure} $P$ is $\le 0$). One can argue (see [19], [11], and below) that the volume of the points $x\in U$ such that $fx,...,f^m x \in U$ behaves like $e^{mP}$. Here again entropy is pumped out of the system by getting rid of the part of $f\rho$ outside of $U$, and $\mu$ may be interpreted as a nonequilibrium state.
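\medskip\indent {\it Example.} A one dimensional caricature of case (iii) (with $f$ noninvertible rather than a diffeomorphism, but where the counting is transparent) is $M=[0,1]$, $fx=3x$ for $x\le 1/2$, $fx=3(1-x)$ for $x\ge 1/2$, with $U=M$ and $A$ the set of points whose orbit never leaves $[0,1]$. The points $x$ such that $fx,...,f^m x \in [0,1]$ form $2^m$ intervals of length $3^{-m}$, so that their volume is exactly $(2/3)^m=e^{mP}$ with $P=\log 2-\log 3<0$. Here $A$ is the standard middle third Cantor set, the limit measure $\mu$ is the measure of maximal entropy on $A$, and $h(\mu)-\hbox{(Lyapunov exponent)}=\log 2-\log 3$ is indeed the maximum value $P$.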
\medskip\indent For a recent, physically oriented, review of nonequilibrium statistical mechanics we refer the reader to Dorfman [9]. He discusses in particular calculations using periodic orbits, as advocated by Cvitanovi\'c (see Artuso, Aurell, and Cvitanovi\'c [1], Cvitanovi\'c, Eckmann, and Gaspard [8]). \medskip\indent {\it Towards physical applications.} \medskip\indent The {\it ergodic hypothesis} states that the Liouville measure restricted to an energy shell ${\cal S}_E$ (for a Hamiltonian system) is ergodic under time evolution. This serves to justify the ensembles of statistical mechanics and, while the ergodic hypothesis is likely to be false in general, it is apparently almost true in the sense that the application of equilibrium statistical mechanics to real systems has been extremely successful. \medskip\indent One may try to base nonequilibrium statistical mechanics on a principle similar to the ergodic hypothesis. Here one assumes that the dynamical system $(M,f)$ describing time evolution is {\it hyperbolic} in some sense\footnote{*}{See Eckmann and Ruelle [11] for definitions and a physically oriented review of dynamical systems.}, and that time averages are given by particular probability measures called SRB measures; these are the nonequilibrium states which replace the microcanonical {\it ensemble} of equilibrium statistical mechanics. The SRB states correspond to time averages for a set of positive measure of initial conditions. They are characterized by smoothness along unstable directions or equivalently by a variational principle\footnote{**}{The approach just indicated to the study of nonequilibrium systems was advocated early in lectures by the present author (G. Gallavotti mentions the date of 1973); for the case of turbulence see [24]. Other people familiar with SRB measures would have had similar ideas, but these have started to be useful only with the recent (1995) work of Gallavotti and Cohen [15], [16]. The mathematical study of SRB states was made by Sinai [26] for Anosov diffeomorphisms, Ruelle [23] for Axiom A diffeomorphisms, and Bowen and Ruelle [4] for Axiom A flows. The very nontrivial extension to nonuniformly hyperbolic systems is due to Ledrappier and Young [22].}. \medskip\indent The assumption that the systems of nonequilibrium statistical mechanics are hyperbolic and described by SRB measures is unlikely to be exactly true, but it is reasonable to expect that it is approximately true in the sense that it gives correct physical predictions in the limit of large systems (thermodynamic limit). \medskip\indent Actual physical predictions were obtained only after Gallavotti and Cohen [15] supplemented the hyperbolicity assumption by the {\it reversibility} assumption. The latter asserts that there is a map $i:M\to M$ such that $i^2=1$, $fi=if^{-1}$. \medskip\indent The {\it chaotic hypothesis} of Gallavotti and Cohen [15], [16] (see also Gallavotti [13], [14]) states that physically correct results (for nonequilibrium systems in the thermodynamic limit) will be obtained by assuming reversibility and treating the system as if it were hyperbolic (in fact Anosov). An essential role in the inspiration of Gallavotti and Cohen was played by the numerical results and analysis of Evans, Cohen and Morriss [12]. \medskip\indent {\it Example.} \medskip\indent Consider a Hamiltonian $H(X)={1\over 2}(p,{\bf M}^{-1}p)+U(q)$, where ${\bf M}$ is the mass matrix and $U$ is the potential energy. We denote by $f^t X$ (with $f^0 X=X$) the solution of Hamilton's equation $\dot X=J{\partial}_X H$. Defining $i(p,q)=(-p,q)$ we find that $f^t i=if^{-t}$, which expresses reversibility. Reversibility is preserved if we introduce an external force $F=(\Phi (q),0)$, and again if we add a Gaussian thermostat.
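\medskip\indent To check the above assertion: if $t\mapsto (p(t),q(t))$ solves the equations of motion then, because $H$ is even in $p$, so does $t\mapsto (-p(-t),q(-t))$. If $(p(0),q(0))=(p_0,q_0)$, this gives $$ f^t i(p_0,q_0)=(-p(-t),q(-t))=i(p(-t),q(-t))=if^{-t}(p_0,q_0) $$ {\it i.e.}, $f^t i=if^{-t}$. The same computation applies when the force $(\Phi (q),0)$ and the Gaussian thermostat term are added, because the thermostat coefficient $\alpha =(\partial_X H.F)/(\partial_X H.\partial_X H)$ is odd in $p$, so that every term in the equations of motion again changes sign correctly under $(p,t)\mapsto (-p,-t)$.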
\medskip\indent {\it Scope of the paper.} \medskip\indent In what follows we shall analyze entropy production and its positivity for the three cases outlined earlier: (i) diffeomorphism, (ii) noninvertible map, (iii) map near a nonattracting set. The treatment of these three cases will be somewhat uneven because the existing mathematical results range from detailed in case (i) to rather limited in case (iii). Since the emphasis of this paper is on having the physics straight we have allowed the uneven mathematical treatment, but suggested some conjectural extensions of the results that are proved. The possibility of a unified presentation will depend on further progress in the ergodic theory of differentiable dynamical systems. \medskip\indent {\it Acknowledgements.} \medskip\indent I am greatly indebted to Giovanni Gallavotti and Joel Lebowitz for discussion of the basic concepts and issues involved in the present paper, and to Jean-Pierre Eckmann for critical reading of the manuscript. \vfill\eject \centerline{1. Entropy production for diffeomorphisms.} \bigskip \bigskip Let $M$ be a compact manifold, and $f:M\to M$ a $C^1$ diffeomorphism. Choosing a Riemann metric on $M$, let $\rho(dx)=\underline\rho(x)dx$ be a probability measure with density $\underline\rho$ with respect to the Riemann volume element $dx$. The direct image $\rho_1 =f\rho$ has density $\underline\rho_1 (x) =\underline\rho(f^{-1}x) / J(f^{-1}x)$, where $J(x)$ is the absolute value of the Jacobian of $f$ at $x$ (computed with respect to the Riemann metric). The statistical mechanical entropy associated with $\underline\rho$ is $$ S(\underline\rho)=-\int dx \underline\rho(x)\log\underline\rho(x) $$ (This means that $dx$ is interpreted as the phase space volume element; if $dx$ is the configuration space volume element, then $S(\underline\rho)$ is the configurational entropy.) The entropy $S(\underline\rho)$ will have to be distinguished from the Kolmogorov-Sinai (time) entropy $h(\mu )$ of an $f$-invariant state $\mu$ used below. The entropy associated with $\underline\rho_1$ is $$ S(\underline\rho_1)=-\int dy \underline\rho_1 (y)\log \underline\rho_1 (y) $$ $$=-\int dy {{\underline\rho(f^{-1} y)}\over{J(f^{-1} y)}} [\log\underline\rho(f^{-1}y)-\log J(f^{-1}y)] $$ $$=-\int dx \underline\rho(x)[\log \underline\rho(x)-\log J(x)]. $$ \medskip\indent The entropy put into the system in one time step is thus $$ S(\underline\rho_1)-S(\underline\rho)=\int dx \underline\rho(x)\log J(x) $$ This means that the entropy pumped out of the system, or produced by the system, is $$ -\int dx \underline\rho(x)\log J(x) $$ Let ${\underline\rho}_m$ be the density of the measure $\rho_m=f^m \rho$. If $\rho_m$ tends vaguely\footnote{*}{The vague topology is the $w^*$-topology on the space of measures considered as dual of the space of continuous functions. We denote a vague limit by v.lim.} to $\mu$ when $m\to\infty$, the entropy production $$-[S({\underline\rho}_{m+1})-S({\underline\rho}_m)]=-\int dx {\underline\rho}_m(x)\log J(x) $$ tends to $$ -\int \mu(dx)\log J(x) $$ It is thus natural to take as definition of the entropy production for an arbitrary $f$-invariant probability measure $\mu$ the expression $$ e_f(\mu)=-\int \mu(dx)\log J(x) $$ \indent In the rest of this Section we take $\mu$ to be ergodic, so that the Lyapunov exponents are constant ($\mu$-a.e.). The general case is obtained by representing $\mu$ as an integral over its ergodic components.
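\medskip\indent {\it Example.} A simple illustration (piecewise affine, and discontinuous on the line $x=1/2$, so strictly speaking only a caricature of a diffeomorphism) is the dissipative baker's map of the unit square: $$ f(x,y)=\cases{(2x,\,y/3)&if $x<1/2$\cr (2x-1,\,(y+2)/3)&if $x>1/2$\cr} $$ Here $J=2/3$ everywhere, so that $e_f(\mu)=\log (3/2)>0$ for every $f$-invariant $\mu$; all the volume is attracted to the product of an interval with a Cantor set. For the SRB measure the Lyapunov exponents are $\log 2$ and $-\log 3$, and $e_f(\mu)=-(\log 2-\log 3)=\log (3/2)$, in agreement with Lemma 1.1 below.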
\medskip\indent {\it 1.1 Lemma.} {\it The entropy production $e_f(\mu)$ is independent of the choice of Riemann metric and equal to minus the sum of the Lyapunov exponents of $\mu$ with respect to $f$.} \medskip\indent This follows from the Oseledec multiplicative ergodic theorem in the form given in [11].\qed \medskip\indent We remind the reader that the Kolmogorov-Sinai entropy $h(\mu)$ is the amount of information produced by $f$ in the state $\mu$ (see for instance Billingsley [2]). We always have $$ h(\mu)\le\sum\hbox{ positive Lyapunov exponents } \eqno(1.1) $$ (this inequality is due to Ruelle, see [11]). We call $\mu$ an SRB measure (see Ledrappier-Young [22], Eckmann-Ruelle [11]) if $$ h(\mu)=\sum\hbox{ positive Lyapunov exponents } \eqno(1.2) $$ ({\it Pesin identity}). If $f$ is of class $C^2$, the above condition is equivalent to $\mu$ having conditional probabilities on unstable manifolds absolutely continuous with respect to Lebesgue measure (Ledrappier-Young [22]). If $f$ is $C^2$ and $\mu$ has no vanishing Lyapunov exponent, then there is a set of positive Riemann volume of points $x\in M$ with time averages $(1/N)\sum_{k=0}^{N-1} \delta({f^k}x)$ tending vaguely to $\mu$ (this result is due to Pugh and Shub, see [11]). \medskip\indent {\it 1.2 Theorem.} {\it Let $f$ be a $C^1$ diffeomorphism and $\mu$ an $f$-invariant probability measure on the compact manifold $M$. \medskip\indent (a) If $\mu$ is an SRB measure then $e_f (\mu)\ge 0$. \medskip\indent (b) Let $f$ be $C^{1+\alpha}$ with $\alpha>0$ and $\mu$ be an SRB measure. If $\mu$ is singular with respect to $dx$ and has no vanishing Lyapunov exponent, then $e_f (\mu)>0$. \medskip\indent (c) For every $a$, $$ \hbox{vol} \{x:{1\over m}\sum_{k=0}^{m-1} \log J(f^k x)\ge a\}\le e^{-ma}\, \hbox{vol} M $$ In particular if ${\cal V}(\mu)=\{x:{\rm v.lim}_{m\to\infty} (1/m)\sum_{k=0}^{m-1}\delta(f^k x)=\mu\}$ and $e_f (\mu)<0$, then $\hbox{vol}\,{\cal V}(\mu)=0$.} \medskip\indent We have denoted by vol the Riemann volume in $M$. In view of the result of Pugh and Shub mentioned above, (a) follows from (c) if $f$ is $C^2$ and $\mu$ has no vanishing Lyapunov exponent. Here is a direct proof of (a): if $\mu$ is SRB we have $$ e_f (\mu)=-\sum \hbox{ Lyapunov exponents } $$ $$=[h(\mu)-\sum \hbox{ positive Lyapunov exponents }]-[h(\mu)+\sum \hbox{ negative Lyapunov exponents }] $$ $$=[h(\mu)-\sum \hbox{ positive Lyap. exp. w.r.t. }f ]-[h(\mu)-\sum \hbox{ positive Lyap. exp. w.r.t. }f^{-1} ] $$ $$ \ge 0 $$ where we have used (1.1) and (1.2). \medskip\indent To prove (b) notice that if $\mu$ is SRB and $e_f (\mu)=0$ then, according to (a), $$ h(\mu)=\sum \hbox{positive Lyapunov exponents}=-\sum \hbox{negative Lyapunov exponents} $$ This implies that $\mu$ is absolutely continuous with respect to $dx$ (see Ledrappier [21] Cor.(5.6)) if $f$ is of class $C^{1+\alpha}$ and $\mu$ has no vanishing Lyapunov exponent, contradicting the assumed singularity of $\mu$. Since $e_f(\mu)\ge 0$ by (a), we must thus have $e_f(\mu)>0$. \medskip\indent To prove (c) write $$ {\cal V}(m)=\{x:{1\over m}\sum_{k=0}^{m-1} \log J(f^k x) \ge a \} $$ We have thus $$ \hbox{vol} M \ge \hbox{vol} f^m {\cal V} (m)=\int_{{\cal V}(m)} \prod_{k=0}^{m-1} J(f^k x)\, dx $$ $$ \ge e^{ma}\, \hbox{vol} {\cal V} (m) $$ as announced.\qed \medskip\indent {\it 1.3 Corollary.} {\it If $\mu$ is an SRB measure with respect to both $f$ and $f^{-1}$, then $e_f(\mu)=0$.} \medskip\indent We have indeed $e_f(\mu)\ge0$, and $e_{f^{-1}}(\mu)=-e_f(\mu)\ge0$. (As pointed out to the author by Joel Lebowitz, this covers the case of the microcanonical ensemble.)\qed
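\medskip\indent {\it Example.} For a volume preserving Anosov diffeomorphism, such as the hyperbolic automorphism of the two dimensional torus defined by the matrix $\pmatrix{2&1\cr 1&1\cr}$, the Riemann volume is an SRB measure for both $f$ and $f^{-1}$; and indeed $J\equiv 1$, so that $e_f(\mu)=0$: in this equilibrium situation there is no entropy production.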
\vfill\eject \centerline{2. Entropy production for noninvertible maps.} \bigskip \bigskip \indent {\it Standing assumptions.} \medskip\indent Let $M$ be a compact Riemann manifold, possibly with boundary. We denote by vol the Riemann volume and by $dx$ the volume element. We assume that a closed set $\Sigma\subset M$ is given, containing the boundary of $M$, and $f:M\backslash \Sigma \to M$ such that the following properties are satisfied: \medskip\indent (A1) vol$\,\Sigma =0$. \medskip\indent (A2) There are disjoint open sets $D_1,...,D_N$ such that $M\backslash \Sigma ={\cup}_{\alpha =1}^N D_{\alpha}$, and $f\mid D_{\alpha}$ is a homeomorphism onto $fD_\alpha$, absolutely continuous with respect to vol. The Jacobian $J$ of $f$ is continuous in $M\backslash \Sigma$ and satisfies $$ \inf_{x\notin \Sigma}J(x)\ge e^{-K}>0 $$ \indent (A3) For all pairs $(\alpha ,\beta)$, $fD_{\alpha}$ and $fD_{\beta}$ are either disjoint or identical. \medskip\indent {\it Comments.} \medskip\indent It is convenient to use a map $f$ defined outside of an {\it excluded} set $\Sigma$. In particular this allows discontinuities on $\Sigma$. When considering the direct image $f\mu$ of a measure $\mu \ge 0$ on $M$ by $f$, we shall have to assume that $\mu (\Sigma )=0$. (We have made such an assumption for the measure vol.) \medskip\indent Condition (A3) might seem very strong, but can be arranged to hold under the weaker assumption $$ \hbox{vol} (fD_{\alpha}\cap \partial fD_{\beta})=0 $$ for all pairs $(\alpha ,\beta )$. Let indeed $( D_{\gamma}^1 )$ be the family of open sets $\cap_{\alpha =1}^N (fD_{\alpha})^\sim$, where $(fD_{\alpha})^\sim$ is either $fD_{\alpha}$ or $M\backslash \hbox{clos} fD_{\alpha}$ for each $\alpha$. Let $D_{\alpha \gamma}^*=D_\alpha \cap f^{-1} D_{\gamma}^1$ and $\Sigma^* =M\backslash \cup_\alpha \cup_\gamma D_{\alpha \gamma}^*$; then (A1), (A2), (A3) hold when $\Sigma$, $(D_\alpha)$ are replaced by $\Sigma^*$, $(D_{\alpha \gamma}^*)$. When considering the direct image $f\mu$, we shall now have to assume that $\mu(\Sigma^*)=0$. \medskip\indent {\it Refining $(D_{\alpha})$.} \medskip\indent Let $fD_{\alpha}=D_{\gamma}^1$. We may write $$ D_{\gamma}^1={\Sigma}_{\gamma}^1 \cup D_{\gamma 1}^1 \cup ... \cup D_{\gamma n}^1 $$ where vol$\,{\Sigma}_{\gamma}^1=0$ and the disjoint open sets $D_{\gamma 1}^1,...,D_{\gamma n}^1$ are small. Writing $D_{\alpha i}=D_{\alpha} \cap f^{-1} D_{\gamma i}^1$, we may replace $(D_\alpha)$ by a family $(D_{\alpha i})$ of arbitrarily small sets. In other words we may refine the family $(D_\alpha)$ to a new family $(D_{\alpha}^*)$ (with $\alpha \in \{1,...,N^* \}$ and an excluded set ${\Sigma}^*$) so that (A1), (A2), (A3) still hold and the sets $D_{\alpha}^*$ are arbitrarily small. \medskip\indent In the study of a measure $\mu \ge 0$ with $\mu(\Sigma)=0$ we can arrange that $(f\mu)({\Sigma}_{\gamma}^1)=0$, implying that $\mu ({\Sigma}^*) =0$. \vfill\eject \indent {\it Folding entropy.} \medskip\indent Let $\mu$ be a positive measure on $M \backslash \Sigma$. (We may also consider $\mu$ as a positive measure on $M$ such that $\mu(\Sigma)=0$.) Our assumptions imply that there is a {\it disintegration} of $\mu$ associated with the map $f$ (see Bourbaki [3], par.3). In general this means that we have the integral representation $$ \mu=\int {\mu}_1 (dx)\, {\nu}_x $$ where ${\mu}_1=f\mu$ is the direct image of $\mu$ by $f$, and ${\nu}_x$ is a probability measure with ${\nu}_x (f^{-1}\{x\})=1$. This representation is essentially unique.
Here we may assume that ${\nu}_x$ is atomic (with at most $N$ atoms) and write $$ H({\nu}_x)=-\sum_{\alpha} p_{\alpha} \log p_{\alpha} $$ where the $p_{\alpha}$ are the masses of the atoms of ${\nu}_x$. (In the general case we would write $H({\nu}_x)=+\infty$ if $\nu_x$ is nonatomic.) We let now $$ F(\mu)=F_f (\mu)=\int {\mu}_1 (dx) H({\nu}_x) $$ and call $F(\mu)$ the {\it folding entropy} of $\mu$ with respect to $f$. \medskip\indent Let again $D_{\gamma}^1=fD_{\alpha}$, and write $\gamma (\alpha)=\gamma$ when $fD_{\alpha}=D_{\gamma}^1$. By the concavity of $t \mapsto -t\log t$, we have $$ ({\mu}_1 (D_{\gamma}^1))^{-1}\int_{D_{\gamma}^1} \mu_1 (dx) H({\nu_x}) \le -\sum_{\alpha : \gamma(\alpha)=\gamma} {{\mu(D_{\alpha})} \over {\mu_1 (D_{\gamma}^1)}} \log{{\mu(D_{\alpha})} \over {\mu_1 (D_{\gamma}^1)}} $$ Therefore, when $(D_{\alpha})$ is replaced by $(D_{\alpha}^*)$ which consists of smaller and smaller sets, the expression $$ F^* (\mu)=\sum_{\gamma} \mu_1 (D_{\gamma}^{*1}) [-\sum_{\alpha :\gamma (\alpha) =\gamma} {{\mu(D_{\alpha}^*)} \over {\mu_1 (D_{\gamma}^{*1})}} \log{{\mu(D_{\alpha}^*)} \over {\mu_1 (D_{\gamma}^{*1})}} ] $$ tends to $F(\mu)=\int \mu_1 (dx) H(\nu_x)$ from above.
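\medskip\indent {\it Example.} For the map $fx=2x$ (mod 1) of the circle, with $\Sigma$ consisting of the two branch points $0$, $1/2$ and $N=2$, take $\mu =$ Riemann volume. Then $\mu_1=\mu$ and $\nu_x ={1\over 2}\delta (x/2)+{1\over 2}\delta ((x+1)/2)$, so that $H(\nu_x)=\log 2$ and $F(\mu)=\log 2$. If instead $\mu$ is the invariant measure carried by the periodic orbit $\{1/3,2/3\}$, each $\nu_x$ is a single atom and $F(\mu)=0$: the folding entropy measures how much mass $f$ actually folds together.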
\medskip\indent {\it 2.1 Proposition. Let $P$ be the set of probability measures on $M$ with the vague topology and $$ I=\{ \mu \in P : \mu \hbox{ is $f$-invariant} \} $$ $$ P_{\backslash \Sigma}=\{ \mu \in P : \mu (\Sigma)=0 \} $$ $$ I_{\backslash \Sigma}=I \cap P_{\backslash \Sigma} $$ \indent (a) The function $F: P_{\backslash\Sigma} \to {\bf R}$ (with values in $[0,\log N]$) is concave upper semicontinuous (u.s.c.). \medskip\indent (b) The restriction of $F$ to $I_{\backslash\Sigma}$ is affine u.s.c.} \medskip\indent Since $H(\nu_x)$ takes values in $[0,\log N]$, so does $F$. To prove concavity we have to estimate $F$ at $\mu^\prime$, $\mu^{\prime \prime}$ and $\mu=(1-t)\mu^\prime +t \mu^{\prime \prime}$, with $\mu^\prime$, $\mu^{\prime \prime}\in P_{\backslash\Sigma}$. We may choose $(D_{\alpha}^*)$ arbitrarily fine so that $\mu^\prime (\Sigma^*)=\mu^{\prime\prime} (\Sigma^*)=0$; therefore $F(\mu)=\lim{F^*}(\mu)$, $F(\mu^\prime)=\lim{F^*}(\mu^\prime)$, $F(\mu^{\prime\prime})=\lim{F^*}(\mu^{\prime\prime})$. Concavity of $F$ follows from the concavity of $t \mapsto {F^*}((1-t)\mu^\prime +t\mu^{\prime\prime})$, or the convexity of $$ t \mapsto \sum_{\alpha} [(1-t)u_{\alpha}+tv_{\alpha}]\log{{(1-t)u_{\alpha}+tv_{\alpha}} \over {\sum_{\beta} ((1-t)u_{\beta}+tv_{\beta})}} $$ \indent Since $P$ is metrizable (with the vague topology), we prove upper semicontinuity of $F$ by showing that if $\rho^{(m)}$, $\mu \in P_{\backslash \Sigma}$, and if the sequence $(\rho^{(m)})$ tends to $\mu$, then $F(\mu) \ge \limsup_{m\to\infty}F(\rho^{(m)})$. We may choose $(D_{\alpha}^*)$ arbitrarily fine so that $\mu(\Sigma^*)=0$ and $\rho^{(m)}(\Sigma^*)=0$ for all $m$; $F(\mu)$ and $F(\rho^{(m)})$ are thus limits of $F^*(\mu)$ and $F^*(\rho^{(m)})$. Since $\mu(\Sigma^*)=\rho^{(m)}(\Sigma^*)=0$, $F^*$ is continuous for the vague topology on the set $S=\{\mu\}\cup\{\rho^{(m)}:m \in {\bf N}\}$, and $F|S$ is thus the limit of a decreasing family of continuous functions, hence upper semicontinuous. This proves (a). \medskip\indent To prove (b) we remark that $\mu^\prime$, $\mu^{\prime\prime}$ are absolutely continuous with respect to $\mu=(1-t)\mu^\prime +t \mu^{\prime\prime}$ (if $t \ne 0,1$) and let $g^\prime={d\mu^\prime / d\mu}$, $g^{\prime\prime}={d\mu^{\prime\prime} / d\mu}$. If $\mu^\prime$, $\mu^{\prime\prime} \in I$ the functions $g^\prime$, $g^{\prime\prime}$ are $f$-invariant, so that $g^\prime (y)=g^\prime (fy)=g^\prime (x)$ for $\nu_x$-almost all $y$. Therefore $$ \mu^\prime (dy)=(g^\prime \mu)(dy)=g^\prime (y) \mu (dy)=\int \mu_1 (dx) g^\prime (y) \nu_x (dy) $$ $$ =\int \mu_1 (dx) g^\prime (x) \nu_x (dy)=\int (g^\prime \mu)_1 (dx) \nu_x (dy)=\int\mu_1^\prime (dx) \nu_x (dy) $$ and similarly for $g^{\prime\prime} \mu$. Therefore $$ (1-t)F(\mu^\prime)+tF(\mu^{\prime\prime})=(1-t)\int \mu_1^\prime (dx) H(\nu_x) +t \int \mu_1^{\prime\prime} (dx) H(\nu_x) =\int \mu_1 (dx) H(\nu_x) =F(\mu) $$ This completes the proof of the proposition.\qed \medskip\indent {\it Extension.} \medskip\indent If $P$ denotes the set of positive measures on $M$ (rather than the probability measures), we have $$ F(\mu) \in [0,\Vert\mu\Vert \log N] $$ Apart from that the above proposition remains true, with the same proof. In fact since $F(\lambda \mu)=\lambda F(\mu)$ for $\lambda \ge 0$, $\mu \ge 0$, $\mu(\Sigma)=0$, the extension of $F$ from probability measures to positive measures is trivial. \medskip\indent {\it Entropy production.} \medskip\indent We define now the {\it entropy production} $e_f (\mu)$ for a dynamical system $(M, f)$ satisfying our standing assumptions and $\mu \in P_{\backslash\Sigma}$ ({\it i.e.}, $\mu$ is a probability measure such that $\mu (\Sigma)=0$). We write $$ e_f (\mu)=F(\mu)- \mu (\log J) $$ This definition will be motivated below, first when $\mu$ is defined by a density, then more generally. \medskip\indent {\it 2.2 Proposition. \medskip\indent (a) $e_f (\mu)$ is independent of the choice of Riemann metric on $M$. \medskip\indent (b) $e_f$ is concave u.s.c. on $P_{\backslash \Sigma}$, and affine u.s.c. on $I_{\backslash \Sigma}$. \medskip\indent (c) If the probability measures $\rho^{(m)}$ are absolutely continuous with respect to Riemann volume, and tend vaguely to $\mu \in P_{\backslash \Sigma}$, we have} $$ \limsup_{m \to \infty} e_f (\rho^{(m)}) \le e_f(\mu) $$ \indent A change of Riemann metric replaces $\log J$ by $\log J + \Phi -\Phi \circ f$; the coboundary $\Phi -\Phi \circ f$ integrates to zero against the ($f$-invariant) measures of interest, so that $\mu (\log J)$ and $e_f(\mu)$ are not changed. This proves (a). \medskip\indent The function $K+\log J$ is $\ge 0$ and continuous on $M \backslash \Sigma$. Let $(\chi_n)$ be an increasing sequence of continuous functions $M \to [0,1]$, vanishing on $\Sigma$ and tending to 1 on $M \backslash \Sigma$. Then $((K+ \log J).\chi_n)$ is an increasing sequence of continuous positive functions tending to $K+\log J$ on $M \backslash \Sigma$. Therefore $$ \mu \mapsto \mu (K+\log J)=K+\mu(\log J) $$ is affine l.s.c. on $P_{\backslash \Sigma}$, and $$ \mu \mapsto -\mu (\log J) $$ is affine u.s.c. on $P_{\backslash \Sigma}$. Together with Proposition 2.1 this proves (b). \medskip\indent To prove (c) we note that, since vol$\,\Sigma=0$, we have $\rho^{(m)}(\Sigma)=0$. It suffices then to apply (b).\qed \medskip\indent {\it Entropy associated with a density.} \medskip\indent Let $\rho$ be a probability measure with density $\underline\rho$ with respect to Riemann volume, {\it i.e.}, $\rho (dx)={\underline\rho}(x) dx$.
If $dx$ is interpreted as phase space volume element, the statistical mechanical entropy associated with $\rho$ is $$ S(\underline\rho)=- \int dx {\underline \rho}(x) \log{\underline \rho}(x) $$ Using the concavity of the log we have $$ S(\underline\rho)=\int dx\,{\underline \rho}(x) \log{1 \over {\underline \rho}(x)} \le \log \int dx\, {\underline \rho}(x){1 \over {\underline \rho}(x)} =\log \hbox{vol}M \eqno(2.1) $$ so that $S(.)$ takes values in $[-\infty, \log \hbox{vol}M]$, the value $-\infty$ being allowed. \medskip\indent If $\psi_\alpha$ is the inverse of $f|D_\alpha$, the direct image $\rho_1=f\rho$ has density $$ {\underline\rho}_1=\sum_\alpha ({\underline\rho}\circ \psi_\alpha).(\bar J \circ \psi_\alpha) $$ where $\bar J=1/J$, and characteristic functions of the sets $f D_\alpha$ have been omitted. Define $$ p_\alpha (x) ={1 \over {\underline\rho}_1 (x)}\, \underline\rho (\psi_\alpha x)\, {\bar J} (\psi_\alpha x) $$ $$ \nu_x =\sum_\alpha p_\alpha (x) \delta (\psi_\alpha x) $$ where $\delta (x)$ denotes the unit mass at $x$. Note that $f\nu_x =\delta (x)$. We have the disintegration $$ \rho =\int dx {\underline\rho}_1 (x) \nu_x \eqno(2.2) $$ and therefore $$ F(\rho)=\int dx {\underline\rho}_1 (x) H(\nu_x) \eqno(2.3) $$ \medskip\indent Note also the identity\footnote {*}{We are applying the formula (familiar in equilibrium statistical mechanics) $$ \log\sum_i e^{-U(i)}=-\sum_i p_i \log p_i -\sum_i p_i U(i) $$ with $p_i=e^{-U(i)} / \sum_j e^{-U(j)}$.} $$ \log{\underline\rho}_1 (x) =-\sum_\alpha p_\alpha (x) \log p_\alpha (x) +\sum_\alpha p_\alpha (x) [\log{\underline\rho}(\psi_\alpha x)+\log {\bar J} (\psi_\alpha x)] $$ $$ =H(\nu_x)+\nu_x (\log{\underline\rho})+\nu_x (\log \bar J) $$ Therefore, using (2.3) and (2.2), $$ -S({\underline\rho}_1)=\int dx {\underline\rho}_1(x) \log{\underline\rho}_1(x) $$ $$ =\int dx {\underline\rho}_1(x) H(\nu_x) +\int dx {\underline\rho}_1(x) \nu_x(\log \underline\rho ) +\int dx {\underline\rho}_1(x) \nu_x (\log \bar J ) =F(\rho)+\rho(\log \underline\rho ) +\rho(\log \bar J ) $$ and, if $S(\underline\rho)=-\rho(\log \underline\rho)$ is $\ne -\infty$, $$ -[S({\underline\rho}_1)-S(\underline\rho)]=F(\rho)+\rho (\log{\bar J}) \eqno(2.4) $$ The right hand side has values $\le \log N+K$, so that $S({\underline\rho}_1) \ne -\infty$ when $S(\underline\rho) \ne -\infty$. \medskip\indent {\it 2.3 Proposition. Let $S(\underline\rho) \ne -\infty$. \medskip\indent (a) The entropy production associated with the density $\underline\rho$ is $$ -[S({\underline\rho}_1)-S(\underline\rho)]=F(\rho)+\rho(\log \bar J)=e_f (\rho) $$ \indent (b) If the probability measures $\rho^{(m)}$ are absolutely continuous with respect to Riemann volume, and tend vaguely to $\mu$ such that $\mu(\Sigma)=0$, we have} $$ e_f(\mu) \ge \limsup_{m \to \infty} [-S({\underline\rho}_1^{(m)})+S({\underline\rho}^{(m)})] $$ \indent (a) follows from (2.4); (b) follows from (a) and Proposition 2.2(c).\qed
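\medskip\indent {\it Example.} The mechanics of (2.4) is easily followed in a simple case. Let $M=[0,1]$, $0<a<1$, and let $f$ map $(0,a)$ and $(a,1)$ affinely onto $(0,1)$, so that $J=1/a$ on $(0,a)$ and $J=1/(1-a)$ on $(a,1)$. If $\underline\rho\equiv 1$ then ${\underline\rho}_1\equiv 1$, $p_1(x)=a$, $p_2(x)=1-a$, hence $$ F(\rho)=-a\log a-(1-a)\log (1-a)\ ,\qquad \rho (\log\bar J)=a\log a+(1-a)\log (1-a) $$ and $e_f(\rho)=0$, in agreement with $S({\underline\rho}_1)=S(\underline\rho)$: for this map the Riemann volume is invariant, and the folding entropy exactly compensates the phase space expansion. Entropy production requires an imbalance between the two terms.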
\medskip\indent {\it Physical discussion.} \medskip\indent The above proposition is our justification to define $e_f (\mu)$ as the entropy production associated with $\mu \in P_{\backslash \Sigma}$. Note that the definition of $e_f (\mu)$ depends only on $\mu$ and $f$ and not on the choice of an approximation of $\mu$ by absolutely continuous measures ${\rho}^{(m)}$. However we only have the inequality $$ e_f (\mu)=F(\mu)+\mu (\log \bar J) \ge \limsup_{m \to \infty} [F({\rho}^{(m)} )+{\rho}^{(m)} (\log\bar J)] $$ where one might hope for an equality. The term $\mu (\log\bar J)$ poses no serious problem in this respect: if we assume that $\log J$ is bounded we have $$ \mu(\log\bar J)=\lim_{m \to \infty} {\rho}^{(m)} (\log\bar J) $$ For the term $F(\mu)$ there might however be a discontinuity of $F$ at $\mu$. What this means is that some mass of $\rho ^{(m)}$ gets ``folded more'' in the limit $\rho ^{(m)} \to \mu$. For instance $f$ might be injective on supp$\,\rho ^{(m)}$ but not on supp$\,\mu$; this would give $F(\rho ^{(m)})=0$, but possibly $F(\mu) >0$. \medskip\indent Physically one should only think of $\mu$ as an idealization of $\rho ^{(m)}$ for large $m$. When the map $f$ ``folds together'' some mass of $\mu$ it almost folds together the corresponding mass of $\rho ^{(m)}$ and, in a coarse-grained description, it thus makes sense to replace $F(\rho ^{(m)} )$ by $F(\mu)$ and to interpret the latter as the physical folding entropy of our system. \medskip\indent Take ${\rho}^{(m)}=f^m \rho$ and suppose that ${\rho}^{(m)} \to \mu \in I_{\backslash \Sigma}$. In the step between time $m$ and time $m+1$, the entropy production is $$ -S({\underline\rho}^{(m+1)})+S({\underline \rho}^{(m)})=F({\rho}^{(m)})+{\rho}^{(m)}(\log\bar J) $$ which we approximate by $F(\mu)+\mu(\log\bar J)$. This seems to mean that $S({\underline\mu})$ decreases by a fixed amount at each time step, which is absurd since $\mu$ does not depend on time. In fact, typically, $\mu$ is singular, {\it i.e.}, its density $\underline\mu$ does not exist, and we should write $S(\underline\mu)=-\infty$. We shall argue later that the entropy production is positive; the system produces this entropy by having its own entropy $S({\underline\rho}^{(m)})$ decrease towards $-\infty$ when $m \to \infty$. The entropy produced is absorbed (or transferred to the outside world) by the time evolution $f$ ({\it i.e.}, by the forces which cause the time evolution). \medskip\indent Let $\rho \mapsto \rho * \theta$ denote the action of a stochastic diffusion operator $\theta$ close to the identity operator. Let us replace the time evolution $\rho \mapsto f\rho$ by the ``noisy evolution'' $\rho \mapsto (f\rho)*\theta$. We assume that this stochastic evolution has a steady state $\mu_{\theta}$ tending to $\mu$ when $\theta \to \hbox{identity}$. Here $S({\underline\mu}_\theta)$ is finite and we can see that the entropy production is due to the diffusion $*\theta$. We may indeed write $$ -S({\underline\mu}^\prime)+S(\underline\mu)=S({\underline\mu}^{\prime\prime})-S({\underline\mu}^\prime) $$ where $\underline\mu$, ${\underline\mu}^\prime$, ${\underline\mu}^{\prime\prime}$ are the densities associated respectively with ${\mu}_{\theta}$, $f{\mu}_{\theta}$, and $(f{\mu}_{\theta})*{\theta}={\mu}_{\theta}$. The left hand side in the above formula is our familiar expression for the entropy production, and the right hand side is the entropy produced by the diffusion. Let ${\mu}^{(m)}$ be obtained from $\rho$ by the noisy evolution after $m$ time steps. Because ${\mu}^{(m)}$ is smeared as compared with ${\rho}^{(m)}={f^m}\rho$, we expect that the folding entropy $F({\mu}^{(m)})$ will be close to $F({\mu}_\theta)$ or $F(\mu)$. This is further justification for our choice of the definition $e_f (\mu)$ for the entropy production.
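\medskip\indent {\it Example.} The simplest illustration of this discussion is a contraction: let $fx=x/2$ on $M=[-1,1]$, so that $N=1$, $J=1/2$, and $F\equiv 0$. Starting from $\underline\rho\equiv 1/2$, the density ${\underline\rho}_m$ is supported on $[-2^{-m},2^{-m}]$ and $S({\underline\rho}_m)=S(\underline\rho)-m\log 2\to -\infty$, while $\rho_m$ tends vaguely to the singular measure $\mu =\delta (0)$, for which $e_f(\mu)=-\mu (\log J)=\log 2$: this is exactly the amount by which $S({\underline\rho}_m)$ decreases at each time step. Adding a small diffusion $*\theta$ would halt the contraction at the scale where diffusion balances it, making $S({\underline\mu}_\theta)$ finite, as described above.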
\medskip\indent {\it Positivity of entropy production.} \medskip\indent The following result, showing that $e_f (\mu) \ge 0$ for physically reasonable $\mu$, is close to the results obtained when $f$ is a diffeomorphism. The proof is remarkably simple. \medskip\indent {\it 2.4 Theorem. Let $\rho$ be a probability measure with density $\underline\rho$ on $M$. If $S(\underline\rho)$ is finite and if $\mu$ is a vague limit of the measures $\rho^{(m)}=(1/m)\sum_{k=0}^{m-1} f^k \rho$ when $m \to \infty$, then $e_f (\mu) \ge 0$.} \medskip\indent By Propositions 2.2(c) and 2.2(b) respectively we have $$ e_f(\mu) \ge \limsup_{m \to \infty} e_f (\rho^{(m)} ) $$ $$ e_f(\rho^{(m)}) \ge {1 \over m}\sum_{k=0}^{m-1} e_f(f^k\rho) $$ \indent Using Proposition 2.3(a), we also have $$ \sum_{k=0}^{m-1} e_f (f^k \rho) =-S({\underline\rho}_m)+S(\underline\rho) $$ Therefore $$ e_f(\mu)\ge\limsup_{m\to\infty}{1\over m}[-S({\underline\rho}_m)+S(\underline\rho)] $$ Since $-S({\underline\rho}_m)\ge -\log \hbox{vol}M$ (by (2.1) above) we obtain $e_f(\mu) \ge0$.\qed \medskip\indent {\it Alternate approach.} \medskip\indent Instead of our standing assumptions, let us suppose that $M$ is a compact manifold and $f:M \to M$ a $C^1$ map. One may then conjecture that $$ h(\mu)\le F(\mu)+|\sum \hbox{negative Lyapunov exponents}| $$ when $\mu$ is an $f$-ergodic probability measure. (If our standing assumptions hold and $f$ is piecewise $C^1$, with $\mu(\Sigma)=0$, this can be proved along the lines of Ruelle [25].) For an SRB state $\mu$ we have $$ h(\mu)=\sum\hbox{positive Lyapunov exponents} $$ and our conjecture implies $$ e_f (\mu)=F(\mu)-\sum\hbox{positive Lyap. exp.}+|\sum\hbox{negative Lyap. exp.}| $$ $$ \ge h(\mu)-\sum\hbox{positive Lyap. exp.}=0 $$ \vfill\eject \centerline{3. Entropy production associated with diffusion.} \bigskip \bigskip \indent Let $M$ be a compact manifold, $f:M \to M$ a diffeomorphism, and $A$ a compact $f$-invariant subset of $M$. Given a small open neighborhood $U$ of $A$, we define $$ U_m =\{ x:f^k x \in U \hbox{ for }k=0,...,m\} $$ Since we do not assume that the set $A$ is attracting, mass will in general leak out of $U$, {\it i.e.}, vol$\,U_m \to 0$ when $m \to \infty$. It is conjectured (see Kantz and Grassberger [19], Eckmann and Ruelle [11], Gaspard and Nicolis [17]) that in many cases vol$\,U_m \approx e^{mP}$, and the escape rate from $A$ under $f$ is (up to change of sign) $$ P=P_{Af}=\sup_{\rho\in\partial I_A}\{h(\rho)-\sum\hbox{positive Lyap. exp. for } (\rho ,f)\}\le 0 $$ where $\partial I_A$ is the set of $f$-ergodic probability measures with support in $A$. \medskip\indent If $\chi_m$ is the characteristic function of $U_m$, let $\rho_{[m]}$ and ${\underline\rho}_{[m]}^*$ be given by $$ \rho_{[m]}(dx)={{\chi_m (x)} \over {\hbox{vol}\,U_m}}\, dx $$ $$ (f^m \rho_{[m]})(dx)={\underline\rho}_{[m]}^* (x)\, dx $$ Then we may define the entropy production associated with escape from $A$ as $$ e_A=\lim_{m\to\infty} {1\over m} [S({\underline\rho}_{[0]})-S({\underline\rho}_{[m]}^*)] \eqno(3.1)$$ if this limit exists. \medskip\indent The next proposition deals with the Axiom A case\footnote{*}{Smale's foundational article [27] is still a convenient introduction to hyperbolic dynamical systems (with the definition of Axiom A diffeomorphisms, basic sets, etc.). For further references see [11].}, which is well understood mathematically. One may conjecture that results obtained in that case hold much more generally, but proofs are lacking at this time.
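\medskip\indent {\it Example.} Before stating the proposition, here is the simplest example to keep in mind (the standard Smale horseshoe, written affinely for transparency). Let $f$ be a diffeomorphism of the sphere which maps the unit square $Q$ over itself in two branches, expanding the vertical direction by a factor $3$ and contracting the horizontal direction by a factor $c<1/2$. Taking $U=Q$, the condition $f^kx\in Q$ for $k=0,...,m$ constrains only the expanding coordinate, which must lie in one of $2^m$ intervals of length $3^{-m}$: thus vol$\,U_m=(2/3)^m$ and $P_{Af}=\log 2-\log 3$. Here $A$ is the invariant Cantor set of the horseshoe, on which $f$ is conjugate to the $2$-shift, and the supremum defining $P_{Af}$ is attained by the measure $\mu$ of maximal entropy: $h(\mu)=\log 2$, with positive Lyapunov exponent $\log 3$.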
\medskip\indent {\it 3.1 Proposition. Let $A$ be a basic set for the $C^2$ Axiom A diffeomorphism $f$, and let $U_m$, $P$, $\rho_{[m]}$, ${\underline\rho}_{[m]}^*$ be as above.} \medskip\indent (a) $\lim_{m\to\infty} {1\over m}\log \hbox{vol}\,U_m=P$ \medskip\indent (b) {\it There is a unique $f$-ergodic probability measure $\mu$ on $A$ such that} $$ h(\mu)-\sum\hbox{positive Lyap. exp. for }(\mu,f)=P $$ {\it \medskip\indent (c) Define} $$ \rho^{(m)} ={1\over m} \sum_{k=0}^{m-1} f^k \rho_{[m]} $$ {\it then} $\hbox{v.lim}\,\rho^{(m)} =\mu$ {\it when $m\to\infty$. \medskip\indent (d) The limit (3.1) defining $e_A$ exists, and $$ e_A=\lim_{m\to\infty} {1\over m}[S({\underline\rho}_{[0]})-S({\underline\rho}_{[m]}^*)]=-P_{Af}-\mu(\log J) $$ (where $J$ is the absolute value of the Jacobian of $f$).} \medskip\indent (a) can readily be extracted from Bowen and Ruelle [4], where a slightly weaker result is proved (and flows are considered instead of diffeomorphisms). \medskip\indent If $J^u$ denotes the Jacobian in the unstable direction, $\log J^u$ is H\"older continuous on $A$ and, since $f|A$ is topologically transitive, there is a unique equilibrium state $\mu$ maximizing $h(\mu)-\mu(\log J^u)$ (see Ruelle [23]). This proves (b). \medskip\indent The {\it volume lemma} of [4] establishes a close relation between $\rho_{[m]}$ and $\mu$. In fact it follows from [4] that any vague limit of $\rho^{(m)}$ when $m\to\infty$ is absolutely continuous with respect to $\mu$. Such a limit is also $f$-invariant and, since $\mu$ is ergodic, equal to $\mu$. This proves (c). \medskip\indent Since ${\underline\rho}_{[m]}(x)={{\chi_m}(x)/ {\hbox{vol}\,U_m}}$, we have $$ S({\underline\rho}_{[0]})-S({\underline\rho}_{[m]}) =\log \hbox{vol}\,U_0-\log \hbox{vol}\,U_m $$ and (a) yields $$ \lim_{m\to\infty} {1\over m} [S({\underline\rho}_{[0]})-S({\underline\rho}_{[m]})]=-P_{Af} $$ We also have $$ S({\underline\rho}_{[m]})-S({\underline\rho}_{[m]}^*) =-\int dx {\underline\rho}_{[m]} (x) \log \prod_{k=0}^{m-1} J(f^k x) =-m \int \rho^{(m)} (dx) \log J(x) $$ hence, using (c), $$ \lim_{m\to\infty} {1\over m}[S({\underline\rho}_{[m]})-S({\underline\rho}_{[m]}^*)] =-\mu (\log J) $$ and (d) follows.\qed \medskip\indent In conclusion the entropy production $e_A$ associated with escape from the Axiom A basic set $A$ under $f$ is $$ e_{Af}(\mu)=-P_{Af}-\mu(\log J) $$ This may be taken as a {\it definition} of $e_{Af}(\mu)$ for all $\mu\in I_A$, when $A$ is an $f$-invariant set, $f$ is not necessarily an Axiom A diffeomorphism, and $I_A$ is the set of $f$-invariant probability measures with support in $A$. Notice that $e_{Af}(\mu)\ne e_f(\mu)$ unless $P_{Af}=0$; this corresponds to the fact that $e_{Af}$ and $e_f$ describe different processes of entropy production (they coincide if $A$ is an attracting set). It is readily seen that $e_{Af}(\mu)$ is independent of the choice of Riemann metric. Here again we shall prove positivity of the entropy production. \medskip\indent {\it 3.2 Proposition. Let $\mu\in\partial I_A$ satisfy the following extension of the Pesin identity} $$ h(\mu)-\sum\hbox{positive Lyapunov exponents}=P_{Af} $$ {\it We have then $$ e_{Af}(\mu)\ge -P_{Af^{-1}} \ge 0 \eqno(3.2) $$} \medskip\indent We have indeed $$ e_{Af}(\mu)=-h(\mu)+\sum\hbox{positive Lyap. exp. for }(\mu,f)-\sum\hbox{Lyap. exp. for }(\mu,f) $$ $$ =-h(\mu)-\sum\hbox{negative Lyap. exp. for }(\mu,f) $$ $$ =-[h_{f^{-1}}(\mu)-\sum\hbox{positive Lyap. exp. for }(\mu,f^{-1})] $$ $$ \ge -P_{Af^{-1}} $$ and (3.2) follows from $P_{Af^{-1}}\le 0$.\qed
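\medskip\indent {\it Example (continued).} For the affine horseshoe considered before Proposition 3.1, $J=3c$ on the two branches, so that $$ e_{Af}(\mu)=-P_{Af}-\mu(\log J)=(\log 3-\log 2)-\log (3c)=\log {1\over 2c}>0 $$ since $c<1/2$. On the other hand the positive Lyapunov exponent of $f^{-1}$ on $A$ is $\log (1/c)$, so that $P_{Af^{-1}}=\log 2-\log (1/c)=\log (2c)$ and $-P_{Af^{-1}}=\log (1/(2c))$: the inequality (3.2) holds here with equality. Note that $e_{Af}(\mu)$ vanishes only in the limiting case $c=1/2$, where the two image strips fill the square and $A$ becomes an attractor for $f^{-1}$, in accordance with Remark 3.3(b) below.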
\medskip\indent {\it 3.3 Remarks.} \medskip\indent (a) Proposition 3.2 holds without restriction, but the interpretations of $e_{Af}(\mu)$ as entropy production and of $|P_{Af}|$ as escape rate are guaranteed only in the Axiom A case. For more general situations such interpretations remain conjectural. \medskip\indent (b) In the Axiom A case $e_{Af}(\mu)=0$ implies that $P_{Af^{-1}}=0$, {\it i.e.}, $A$ is an attractor for $f^{-1}$, and $\mu$ is the corresponding SRB measure on $A$. \vfill\eject \centerline{References.} \bigskip \item{[1]} R.Artuso, E.Aurell and P.Cvitanovi\'c. ``Recycling of strange sets: I. Cycle expansions, II. Applications.'' Nonlinearity {\bf 3},325-359,361-386(1990). \item{[2]} P.Billingsley. {\it Ergodic theory and information.} John Wiley, New York, 1965. \item{[3]} N.Bourbaki. {\it \'El\'ements de math\'ematique.} {\bf VI.} {\it Int\'egration.} Ch.6 {\it Int\'egration vectorielle.} Hermann, Paris, 1959. \item{[4]} R.Bowen and D.Ruelle. ``The ergodic theory of Axiom A flows.'' Invent. Math. {\bf 29},181-202(1975). \item{[5]} N.I.Chernov, G.L.Eyink, J.L.Lebowitz, and Ya.G.Sinai. ``Steady-state electrical conduction in the periodic Lorentz gas.'' Commun. Math. Phys. {\bf 154},569-601(1993). \item{[6]} N.I.Chernov and J.L.Lebowitz. ``Stationary shear flow in boundary driven Hamiltonian systems.'' Phys. Rev. Letters {\bf 75},2831-2834(1995). \item{[7]} N.I.Chernov and J.L.Lebowitz. In preparation. \item{[8]} P.Cvitanovi\'c, J.-P.Eckmann and P.Gaspard. ``Transport properties of the Lorentz gas in terms of periodic orbits.'' Chaos, Solitons and Fractals {\bf 4}(1995). \item{[9]} J.R.Dorfman. {\it From molecular chaos to dynamical chaos.} Lecture notes. Maryland, 1995. \item{[10]} J.R.Dorfman and P.Gaspard. ``Chaotic scattering theory of transport and reaction rate coefficients.'' Phys. Rev. {\bf E51},28(1995). \item{[11]} J.-P.Eckmann and D.Ruelle. ``Ergodic theory of strange attractors.'' Rev. Mod. Phys. {\bf 57},617-656(1985). \item{[12]} D.J.Evans, E.G.D.Cohen and G.P.Morriss. ``Probability of second law violations in shearing steady flows.'' Phys. Rev. Letters {\bf 71},2401-2404(1993). \item{[13]} G.Gallavotti. ``Reversible Anosov diffeomorphisms and large deviations.'' Math. Phys. Electronic J. {\bf 1},1-12(1995). \item{[14]} G.Gallavotti. ``Chaotic hypothesis: Onsager reciprocity and fluctuation-dissipation theorem.'' Preprint. \item{[15]} G.Gallavotti and E.G.D.Cohen. ``Dynamical ensembles in nonequilibrium statistical mechanics.'' Phys. Rev. Letters {\bf 74},2694-2697(1995). \item{[16]} G.Gallavotti and E.G.D.Cohen. ``Dynamical ensembles in stationary states.'' J. Statist. Phys. {\bf 80},931-970(1995). \item{[17]} P.Gaspard and G.Nicolis. ``Transport properties, Lyapunov exponents, and entropy per unit time.'' Phys. Rev. Letters {\bf 65},1693-1696(1990). \item{[18]} W.G.Hoover. {\it Molecular dynamics.} Lecture Notes in Physics {\bf 258}. Springer, Heidelberg, 1986. \item{[19]} H.Kantz and P.Grassberger. ``Repellers, semi-attractors, and long-lived chaotic transients.'' Physica {\bf 17D},75-86(1985). \item{[20]} J.L.Lebowitz. ``Boltzmann's entropy and time's arrow.'' Physics Today {\bf 46},No 9,32-38(1993). \item{[21]} F.Ledrappier. ``Propri\'et\'es ergodiques des mesures de Sinai.'' Publ. math. IHES {\bf 59},163-188(1984). \item{[22]} F.Ledrappier and L.S.Young. ``The metric entropy of diffeomorphisms: I. Characterization of measures satisfying Pesin's formula, II. Relations between entropy, exponents and dimension.'' Ann. of Math. {\bf 122},509-539,540-574(1985). \item{[23]} D.Ruelle. ``A measure associated with Axiom A attractors.'' Am. J. Math. {\bf 98},619-654(1976).
"Sensitive dependence on initial condition and turbulent behavior of dynamical systems." Ann. N.Y. Acad. Sci. {\bf 316},408-416(1978) [25] D.Ruelle. "An inequality for the entropy of differentiable maps." Bol. Soc. Bras. Mat. {\bf 9},83-87(1978). [26] Ya.G.Sinai. "Gibbs measures in ergodic theory." Usp. Mat. Nauk {\bf 27},No 4,21-64 (1972) [Russian Math. Surveys {\bf 27},No 4,21-69(1972)]. [27] S.Smale. "Differentiable dynamical systems." Bull. Amer. Math. Soc. {\bf 73},747-817(1967). \vfill\eject \bye