**IAN STEWART:
ENTROPY VS. DISORDER AND GRAVITICS VS. THERMODYNAMICS**


*By Mark Perakh*

Posted on November 11, 2004


Ian Stewart is a distinguished British mathematician and
writer. This essay is a brief commentary on certain aspects of Ian Stewart’s
article titled “The Second Law of Gravitics and the Fourth Law of
Thermodynamics” which is found in the collection *From Complexity to Life*
edited by Niels Henrik Gregersen. [1] In two essays published previously I
commented on the articles by Paul Davies [2] and Charles H. Bennett [3] in the
same collection, so this essay is the third installment in the planned review of
the entire collection (or at least of most of its chapters).

Overall, Ian Stewart’s paper is thought-provoking and contains many points I agree with. I have, however, a few reservations regarding some of Stewart’s notions.

One such notion concerns the relationship between entropy and “disorder.” For example, Stewart renders his thesis as follows (page 122):

“… order/disorder metaphor is fraught with tacit assumptions (that are often wrong) and is inconsistent, ill defined, and by no mean unique.”

He also writes (page 144):

“Thermodynamic entropy is metaphorically identified with disorder. However, the assumptions involved in such terminology are exceedingly dubious.”

On pages 122-123, Stewart discusses an example of gas that expands from one corner of a container to fill the entire available volume, and writes that the conventional interpretation of that process concludes (page 123) that “loss of order corresponds to gain of entropy, so entropy equals disorder.” However, in Stewart’s view such an interpretation is far from being universal. As I understand Stewart’s notions, in his view the distinction between “order” and “disorder” has no rigorous definition but only an intuitive understanding which is one of the reasons that the definition of entropy as a measure of disorder is not well-substantiated.

To my mind, this conclusion of Stewart’s is not quite convincing.

Indeed, if there is no rigorous and commonly
accepted definition of “disorder” except for an intuitive understanding (as
Stewart seems to think) then whatever definition of entropy were chosen, it
couldn’t contradict the *non-existent* definition of disorder. If this were
the case, what would prevent us from choosing entropy to be a measure of
“disorder” by carefully defining the connection between the two concepts?
Adopting entropy as a measure of disorder is not prohibited by any known laws
of physics or theorems of information theory.

In fact, however, there are definitions of
disorder. Therefore, in order to decide whether defining entropy as a measure of
disorder is substantiated, we have to see if such a definition is compatible
with the *existing* definitions of disorder.

In my view, the definition of entropy as a measure of disorder can quite well fit some of the existing concepts of disorder. In particular, a consistent and fruitful definition of disorder is given in the algorithmic theory of complexity/information by Solomonoff, Kolmogorov, and Chaitin (SKC), whose seminal concepts have been explained, for example, in the elegant article by Gregory Chaitin reprinted in the same collection.[4]

The SKC theory ties together complexity, randomness, and information (although “Kolmogorov information” differs from Shannon information [5]). It is easiest to apply it to informational systems, e.g. to texts.

Let us represent the system under investigation as a text. A sufficiently complete description of the system in plain words and/or mathematical formulae can serve as the source of such a text. This description can always be transcribed in binary symbols, i.e. as a string of zeroes and ones. (A simple example is representing a text, written in any conventional alphabet, by Morse alphabet, wherein each dot represents a 0, and each dash a 1).
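As a toy illustration (mine, not the essay’s), any text can indeed be transcribed into such a binary string, for example by spelling out each character’s 8-bit code:

```python
def to_binary(text: str) -> str:
    """Transcribe a text into a string of zeroes and ones,
    using each character's 8-bit UTF-8 code."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))

print(to_binary("gas"))  # 011001110110000101110011
```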

In terms of algorithmic information theory, a *truly random* (i.e. disordered) text cannot be *compressed* while preserving the amount of information it carries. In terms of classical information theory, a random text has zero redundancy. Redundancy is a single-valued function of the text’s Shannon entropy (the latter measured in bits per character):

R = 1 − S_{act}/S_{max}

where R is redundancy (a dimensionless quantity, 0≤R<1); S_{max} is the maximum possible specific entropy of a text (i.e. the entropy per character of a perfectly random text, whose redundancy is R=0), which equals S_{max}=log_{2} N, where N is the number of symbols available in the alphabet; and S_{act} is the specific entropy of the actual specimen of text which, unlike the perfectly random text, possesses a certain redundancy R>0.
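The formula above can be sketched in a few lines of code (an illustration of mine, not from the essay; the alphabet size N is here estimated from the text itself, which is a simplifying assumption, and the sketch presumes at least two distinct symbols):

```python
from collections import Counter
from math import log2

def redundancy(text: str) -> float:
    """R = 1 - S_act/S_max for a text with at least two distinct symbols.
    S_act: Shannon entropy per character of the actual text;
    S_max = log2(N): entropy per character of a perfectly random text
    over an alphabet of N symbols (here N is estimated from the text)."""
    counts = Counter(text)
    n = len(text)
    s_act = -sum((c / n) * log2(c / n) for c in counts.values())
    s_max = log2(len(counts))
    return 1 - s_act / s_max

print(redundancy("abcd"))             # 0.0 -- equal frequencies, no redundancy
print(round(redundancy("aaab"), 3))   # 0.189 -- skewed frequencies, some redundancy
```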

To my mind, the definition of entropy as a measure of disorder (i.e. of degree of randomness) is fully compatible with the concept of disorder (randomness) as defined in the algorithmic theory of information.
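A quick, informal illustration of the incompressibility criterion (my sketch, using the general-purpose `zlib` compressor as a crude stand-in for Kolmogorov complexity, which is itself uncomputable): a highly ordered string compresses drastically, while a random string of the same length barely compresses at all.

```python
import random
import zlib

# Ordered vs. random byte strings of equal length (10,000 bytes each).
random.seed(0)
ordered = b"ab" * 5000
rand = bytes(random.randrange(256) for _ in range(10000))

# The ordered string shrinks dramatically; the random one barely, if at all.
print(len(zlib.compress(ordered)) < 100)   # True
print(len(zlib.compress(rand)) > 9000)     # True
```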

Is the same true for thermodynamic entropy? First, recall that Shannon’s formula for a text’s entropy (also referred to as Shannon’s uncertainty) is the same as the Boltzmann-Gibbs formula for thermodynamic entropy except for a constant. This formula is

H = −K Σ_{i} p_{i} log p_{i}.

K is a constant which in Shannon’s version is dimensionless and taken to be K=1, while in the Gibbs-Boltzmann version K is the Boltzmann constant measured in Joule/Kelvin. Also, in the Gibbs-Boltzmann version natural logarithms are traditionally used, while in Shannon’s version it is most often a logarithm to the base of 2, which results in measuring H in *bits per character* (although natural logarithms are occasionally used as well, in which case instead of *bits* the units are called *nats* or *nits*). Finally, in Shannon’s version p_{i} is the probability (in practice, the relative frequency) of the i-th character of the alphabet, while in the Gibbs-Boltzmann version p_{i} is the probability of the i-th microstate of the system.

The mentioned differences between Shannon’s and thermodynamic entropies do not affect the principal interpretation of entropy as a measure of disorder. (In statistical physics [6] entropy is treated as an essentially dimensionless quantity, thus erasing its most salient difference from Shannon’s uncertainty).

In order to interpret thermodynamic entropy as
a measure of disorder, it is useful to review the behavior of “parameters” of
the system. Generally speaking, each system is affected by an immense number of
parameters, but only a fraction of them exert a substantial influence on the
system’s behavior. Choosing the set of a system’s *essential* parameters
means selecting those few parameters which are assumed to principally affect the
system’s behavior, and ignoring a vast number of other parameters assumed to
play an insignificant role in that behavior. The selected “crucial” parameters
must be given a quantitative representation. For example, for a system which is
a certain amount of ideal gas of mass *m,* the essential parameters can be
reasonably limited to only three – pressure, volume, temperature – P,V,T. Then
the system’s degree of order may be defined as the degree to which the
parameters have *gradients*.

In a perfectly “disordered” (randomized) system
there are no gradients of the appropriate parameters. Of course order is a
matter of degree. A system can be more or less ordered (or disordered). There
can never be certainty that a system is perfectly disordered (random). However, it is in principle feasible to assert that a system is *not* random – to this end it is sufficient to point to any non-zero gradients of its parameters (for example, for the ideal gas of mass *m*, the gradients in question may be such as |dP/dx|>0, |dV/dx|>0, |dT/dx|>0, |dP/dy|>0, |dV/dy|>0, |dT/dy|>0, |dP/dz|>0, |dV/dz|>0, and/or |dT/dz|>0, etc.).
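The gradient criterion can be sketched as follows (an illustration of mine with made-up temperature samples; equal spacing of the sample points along the x axis is assumed, so a finite difference stands in for the gradient):

```python
def has_gradient(samples, threshold=1e-9):
    """True if any finite difference between neighboring samples of a
    parameter (e.g. temperature T along x) exceeds the threshold,
    i.e. if a non-zero gradient is present."""
    return any(abs(b - a) > threshold for a, b in zip(samples, samples[1:]))

uniform_T = [300.0, 300.0, 300.0, 300.0]   # equilibrated gas: no gradient
heated_T  = [280.0, 300.0, 320.0, 340.0]   # driven system: clear gradient

print(has_gradient(uniform_T))  # False -- cannot assert non-randomness
print(has_gradient(heated_T))   # True  -- the system is demonstrably not random
```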

I believe this approach in no way contradicts Solomonoff-Kolmogorov-Chaitin’s concepts of randomness and complexity.

In view of the above, I submit that it is possible to assert that the thermodynamic entropy (treated statistically) is essentially the same concept as informational entropy (i.e. as Shannon’s uncertainty) and can reasonably be chosen as a measure of disorder as well.

Now let us turn to another point in Stewart’s paper.
On page 142 we read, “In information theory (Shannon & Weaver 1964) there is a
quantity called *entropy *that is the negative of information.”

Since I don’t have at hand Shannon & Weaver’s
book of 1964, I’ll refer to Shannon’s original work of 1948, titled *A
Mathematical Theory of Communication*.[7] If we look it up we’ll discover
that entropy, in Shannon’s original interpretation, could hardly be consistently
interpreted as the negative of information. In fact, Shannon had defined entropy
as the *average information* per character in a string, and it is this
definition which reflects the contents of the above formula for H. I don’t see
how a quantity which is the average information can simultaneously be construed as the negative of information as a general interpretation.

Perhaps the confusion may stem from the view of
entropy as a property of a system. In fact the only consistent interpretation of
entropy is viewing it as a *measure of disorder* but not as a *property*
of either the system or of its components. As a *measure*, entropy of, say,
gas, is not a property of the molecules but only a measure of the degree of
randomness of their distribution over the container’s volume (and/or of the
distribution of molecules’ momenta). Likewise, Shannon entropy of a string is
not a property of characters of which the string is composed but the measure of
randomness of the characters’ distribution along the string.

A measuring tape can be used to measure, say,
my height. However, nobody would think that the tape *is* my height, or
that the tape is a part of my organism. Likewise, entropy can be used to *measure* the loss of information, but that does not make it the negative of information by definition. Entropy (i.e., average information in Shannon’s conceptual system) can be used to measure the decrease of uncertainty at the receiving end of a communication chain, but that does not make it the negative of information either. It can equally be used to measure the amount of information carried from the source of a sent message (sending information out is not necessarily accompanied by a loss of information at the source). Entropy and information have no “plus” or “minus” sign inherently attached – they can be construed either as positive or negative depending on the vantage point. What is negative for John may be positive for Mary.

Indeed, continuing his discourse, Stewart himself denies the quoted interpretation of Shannon’s entropy as the negative of information. For example, on page 143 we read,

“…in information theory, the information ‘in’ a message is not negative information-theoretic entropy. Indeed, the entropy of the source remains unchanged, no matter how many messages it generates.”

I agree with that statement. Note that it is at odds with the concepts affirmed in the papers by Paul Davies [8, 9] in the same collection, as is discussed in my review of Davies’s first paper [2]. (I intend to write up a discussion of Davies’ second paper [9] wherein, in particular, to argue against Davies’s thesis regarding entropy as being a negative of information; I plan to specifically discuss there Davies’s example of information’s and entropy’s changes in the process of achieving equilibrium between portions of a gas that initially had different temperatures.)

I also believe that accepting the notion of
entropy as a *measure* rather than a *property* makes attempts to
define a meaningful “law of conservation of information” hopeless – such a law
would be meaningless as information is not a commodity which can be conserved,
and the above quotation from page 143 of Stewart’s article is in agreement with
my statement. (Another contributor to the collection in question, William
Dembski, has been promoting for several years his version of the “law of the
conservation of information” which he claims to be the “4^{th} law of
thermodynamics.” Besides the general fault of such attempts because conservation
of information is a concept void of a meaningful interpretation, the alleged
law, in Dembski’s formulation, contradicts the 2^{nd} law of
thermodynamics, as discussed elsewhere [10]).

Returning to the role of thermodynamic entropy as a measure of disorder, let us note that the entropy of texts (including molecular chains such as DNA and the like), which reflects the degree of a text’s incompressibility in the Kolmogorov-Chaitin sense, can also be naturally tied to gradients of appropriate parameters.

For example, as was shown [11], semantically
meaningful texts, regardless of language, authorship, etc., display what we
called Letter Serial Correlation (LSC) which is absent in gibberish texts. In
particular, LSC in semantically meaningful strings results in the existence of
what we called the Average Domain of Minimal Letter Variability which is not
found in gibberish strings. Along the semantically meaningful string there is a
gradient of a parameter which we called the Letter Serial Correlation sum, S_{m}. No such gradients are observed in “disordered” texts, i.e. in randomized gibberish, where S_{m} displays haphazard variations along the string.
In highly ordered but meaningless strings of symbols gradients of S_{m}
often display wave-like regular variations along the string. In semantically
meaningful strings gradients of S_{m} are also variable along the text,
but the variations of S_{m }along the string display a peculiar, easily
recognizable shape, *qualitatively identical* for all meaningful texts
regardless of language but differing quantitatively depending on the language
and specific texts.

Accordingly, the entropy of texts is the
highest for chaotic gibberish “texts” where S_{m }varies haphazardly so
no gradients emerge (except for accidental local mini-gradients). It is minimal
in highly ordered meaningless texts where gradient of S_{m} varies along
the string in a regular wave-like manner. For semantically meaningful texts
entropy has intermediate values. Therefore it also seems reasonable to define
entropy as a measure of disorder for texts from the standpoint of gradients.
(There seems to be a certain limitation to this interpretation, pertaining to
ideally ordered meaningless strings; it is, however, due to the peculiar choice
of S_{m }as the parameter in question rather than to the overall gist of
the argument. This limitation can be removed by choosing, instead of S_{m},
another, slightly different parameter; this point will not be further discussed
here because of its extraneous character for this discourse).

For thermodynamic systems, to consistently apply the above definition of entropy as a measure of disorder we have to look at the behavior of appropriate thermodynamic parameters as well, and if we discover gradients, we define the system as having a degree of order. If a process results in steeper gradients, either constant or variable, entropy decreases (as well as the degree of disorder).

The 2^{nd} law of thermodynamics requires increase of entropy only in isolated (“closed”) systems, i.e. only as a result of *spontaneous* processes. If a process is imposed on a system, entropy can behave in different ways, and the 2^{nd} law does not prohibit its decrease (while also not asserting its inevitability).

Stewart considers an example of gas gathering in a corner of a container. The gathering of gas molecules in a corner of a container, while not entirely impossible, is highly unlikely to occur spontaneously. We conclude that a spontaneous process most likely results in gas spreading all over the container’s volume, and is accompanied by entropy increase and a reduction (up to the almost complete elimination) of gradients of density, temperature, and pressure (except for local fluctuations). Nothing prohibits interpreting this as an increase of disorder, and such a definition does not contradict the definition of entropy as a measure of disorder. If, though, molecules are driven into a corner by external force, entropy of gas decreases and we may legitimately assert that the order within the system increases as well; such an assertion will be in tune with the interpretation of disorder both from the standpoint of Kolmogorov-Chaitin’s incompressibility of a string representing the system and from the standpoint of gradients as measures of order.

Stewart suggests that a Fourth law of
thermodynamics may indeed exist, but in his view it is not the putative law of
conservation of information (which, I believe, belongs in crank science) but
rather a more plausible “2^{nd} law of gravitics.”

I enjoyed Stewart’s discourse where he
suggested that putative new law, although I have some reservations. One very
minor reproach I could offer is about Stewart’s reference (page 116) to
“standard three laws of thermodynamics.” It is an odd reference because there
are not three but four “standard” laws of thermodynamics (the zeroth, the first,
the second, and the third). An additional law of thermodynamics, if ever
established, would indeed be named the 4^{th} law, but not because there
are only three such laws so far, but because of the peculiar way the existing
four laws have been historically named.

The “2^{nd} law of gravitics” (which, according to Stewart, is so far just a “place-holder” for a future law supplementing the existing laws of thermodynamics, and which is expected to reside outside of thermodynamics) is formulated by Stewart as follows (page 146):


*As time passes (forwards according to the thermodynamic arrow) gravitic
systems tend to become more and more clumped.*


To my mind, the above statement sounds more like an observation, i.e., a statement of fact, than like what is usually given the status of a law of science. Of course, in a broad sense any statement of fact may be claimed to be a law, and to my mind this is a legitimate usage of terms. Say, “at a pressure of about 10^{5} Pascal, ice melts at 273.15 Kelvin” is a statement of fact derived from observation and may legitimately be viewed as a scientific “law.” However, if we survey the conventional practice, the honorific
title of “law” is more often reserved for more profound statements, especially
if such a statement is offered as something on a par with the 2^{nd} law
of thermodynamics. This is, of course, a minor point, more so because Stewart
adds:

“Observations support this law – but what causes the clumping?”

Stewart suggests that clumping results from
“scaling symmetry” of the law of gravitation. This topic is beyond the scope of
my commentary, so all I can say is that I tend to accept the notions about “scaling
symmetry” and “oligarchic growth” [12] of clusters of matter, but all this, in
my view, is yet insufficient to justify introduction of a new and very profound
“law” which would complement the four known laws of thermodynamics. While
Stewart’s ideas regarding the possible law serving as a substitute for the 4^{th}
law of thermodynamics are, in my view, of interest and deserve further analysis,
so far I remain unconvinced that the above “2^{nd} law of gravitics” in
that formulation has indeed accomplished the task. Of course I have no objection
to viewing it as a “place holder” for a possible future law.

REFERENCES

1. Ian Stewart. “The Second Law of Gravitics and the Fourth Law of Thermodynamics.” In Niels Henrik Gregersen, ed., *From Complexity to Life: On the Emergence of Life and Meaning*. NY: Oxford University Press, 2003: 114-150.
2. Mark Perakh. “Paul Davies: Emergentist vs. Reductionist.” *Talk Reason*, www.talkreason.org/articles/davies.cfm or http://members.cox.net/perakm/davies.htm. Posted on September 25, 2004.
3. Mark Perakh. “Defining Complexity.” *Talk Reason*, www.talkreason.org/articles/complexity.pdf or http://members.cox.net/perakm/complexity.pdf. Posted on August 12, 2004.
4. Gregory J. Chaitin. “Randomness and Mathematical Proof.” In Niels Henrik Gregersen, ed., *From Complexity to Life: On the Emergence of Life and Meaning*. NY: Oxford University Press, 2003: 19-33.
5. Peter Grünwald & Paul Vitányi. “Shannon Information and Kolmogorov Complexity.” http://www.arxiv.org/abs/cs.IT/0410002; accessed on October 5, 2004.
6. Lev D. Landau and Evgeny M. Lifshits. *Statisticheskaya Fizika* (Statistical Physics, in Russian; while the reference here is to the Russian original, an English translation is available). Moscow: Nauka, 1964: 38-44.
7. Claude E. Shannon. “A Mathematical Theory of Communication,” parts 1 and 2. *Bell System Technical Journal* (July 1948): 379-423; (October 1948): 623-656.
8. Paul Davies. “Introduction: Toward an Emergentist Worldview.” In Niels Henrik Gregersen, ed., *From Complexity to Life: On the Emergence of Life and Meaning*. NY: Oxford University Press, 2003: 3-16.
9. Paul Davies. “Complexity and the Arrow of Time.” In Niels Henrik Gregersen, ed., *From Complexity to Life*. NY: Oxford University Press, 2003: 74.
10. Mark Perakh. *Unintelligent Design*. Prometheus Books, 2004 (chapter 1: 88-90).
11. Mark Perakh and Brendan McKay. “Letter Serial Correlation.” *Mark Perakh’s website*, http://members.cox.net/marperak/Texts/; accessed on October 24, 2004.
12. E. Kokubo and S. Ida. “Oligarchic Growth of Protoplanets.” *Icarus* 131 (1998): 171-178 (as referred to by Stewart in the article reviewed here; reference not verified).