*Posted November 12, 2004*

Ian Stewart is a distinguished British mathematician and
writer. This essay is a brief commentary on certain aspects of his
article titled "The Second Law of Gravitics and the Fourth Law of
Thermodynamics," found in the
collection *From Complexity to Life* edited by Niels Henrik Gregersen. [1]
In two essays published previously I commented on the articles by Paul Davies [2]
and Charles H. Bennett [3] in the same collection, so this essay is the third
installment in the planned review of the entire collection (or at least of most
of its chapters).

Overall, Ian Stewart's paper is thought-provoking and contains many points I agree with. I have, however, a few reservations regarding some of Stewart's notions.

One such notion concerns the relationship between entropy and "disorder." For example, Stewart renders his thesis as follows (page 122):

"... order/disorder metaphor is fraught with tacit assumptions (that are often wrong) and is inconsistent, ill defined, and by no means unique."

He also writes (page 144):

"Thermodynamic entropy is metaphorically identified with disorder. However, the assumptions involved in such terminology are exceedingly dubious."

On pages 122-123, Stewart discusses an example of gas that expands from one corner of a container to fill the entire available volume, and writes that the conventional interpretation of that process concludes (page 123) that "loss of order corresponds to gain of entropy, so entropy equals disorder." However, in Stewart's view such an interpretation is far from being universal. As I understand Stewart's notions, in his view the distinction between "order" and "disorder" has no rigorous definition but only an intuitive understanding, which is one of the reasons that the definition of entropy as a measure of disorder is not well-substantiated.

To my mind, this conclusion of Stewart's is not quite convincing.

Indeed, if
there is no rigorous and commonly accepted definition of "disorder" except for
an intuitive understanding (as Stewart seems to think) then whatever definition
of entropy were chosen, it couldn't contradict the *non-existent*
definition of disorder. If this were the case, what would prevent us from choosing
entropy to be a measure of "disorder" by carefully defining the connection
between the two concepts? Adopting entropy
as a measure of disorder is not prohibited by any known laws of physics or theorems
of information theory.

In fact,
however, there are definitions of disorder. Therefore, in order to decide whether
defining entropy as a measure of disorder is substantiated, we have to see if
such a definition is compatible with the *existing* definitions of
disorder.

In my view, the definition of entropy as a measure of disorder can quite well fit some of the existing concepts of disorder. In particular, a consistent and fruitful definition of disorder is given in the algorithmic theory of complexity/information by Solomonoff, Kolmogorov, and Chaitin (SKC), whose seminal concepts have been explained, for example, in the elegant article by Gregory Chaitin reprinted in the same collection.[4]

The SKC theory ties together complexity, randomness, and information (although "Kolmogorov information" differs from Shannon information [5]). It is easiest to apply it to informational systems, e.g. to texts.

Let us represent the system under investigation as a text. A sufficiently complete description of the system in plain words and/or mathematical formulae can serve as the source of such a text. This description can always be transcribed in binary symbols, i.e. as a string of zeroes and ones. (A simple example is representing a text, written in any conventional alphabet, by Morse alphabet, wherein each dot represents a 0, and each dash a 1).
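Such a transcription can be sketched in a few lines; in the following illustration of mine each character is replaced by its 8-bit ASCII code, playing the role of the Morse example above:

```python
# Transcribe a text into a binary string of zeroes and ones by
# replacing each character with its 8-bit ASCII code.
def to_binary(text: str) -> str:
    return "".join(format(ord(c), "08b") for c in text)

bits = to_binary("gas")
print(bits)  # 011001110110000101110011
```

Any conventional text can thus be treated as a binary string, which is the form in which the algorithmic theory of information handles it.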

In terms of
algorithmic information theory, a *truly random* (i.e. disordered) text
cannot be *compressed* preserving the amount of information it carries. In
terms of classical information theory a random text has zero redundancy.
Redundancy is a single-valued function of the text's Shannon's entropy (the
latter measured in bits per character):

R = 1 - S_act / S_max

where R is redundancy (a dimensionless quantity, 0 ≤ R < 1); S_max is the maximum possible specific entropy of a text (i.e. the entropy per character of a perfectly random text, whose redundancy is R = 0), which equals S_max = log_2 N, where N is the number of symbols available in the alphabet; and S_act is the specific entropy of the actual specimen of text which, unlike the perfectly random text, possesses a certain redundancy R > 0.

To my mind, the definition of entropy as a measure of disorder (i.e. of degree of randomness) is fully compatible with the concept of disorder (randomness) as defined in the algorithmic theory of information.
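Kolmogorov complexity itself is uncomputable, but a general-purpose compressor provides a rough practical proxy for the incompressibility that SKC theory associates with randomness: gibberish barely compresses, while a highly ordered string collapses to almost nothing. A sketch (using zlib as the compressor is my choice of illustration, not part of the SKC definitions):

```python
import random
import string
import zlib

def compressed_ratio(text: str) -> float:
    """Compressed size / original size; the higher the ratio,
    the closer the string is to incompressible (random)."""
    data = text.encode("ascii")
    return len(zlib.compress(data, 9)) / len(data)

random.seed(0)
gibberish = "".join(random.choice(string.ascii_lowercase) for _ in range(10_000))
ordered = "ab" * 5_000

print(compressed_ratio(gibberish))  # well above the ordered string's ratio
print(compressed_ratio(ordered))    # tiny: the string is almost fully compressible
```

The random string resists compression (it retains only the statistical regularity of a 26-letter alphabet), while the periodic string compresses to a small fraction of its size, in line with the incompressibility criterion of randomness.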

Is the same true for thermodynamic entropy? First, recall that Shannon's formula for text's entropy (also referred to as Shannon's uncertainty) is the same as Boltzmann-Gibbs's formula for thermodynamic entropy except for a constant. This formula is

H = -K Σ p_i log p_i.

K is a constant which in Shannon's version is
dimensionless and taken to be K = 1, while in Gibbs-Boltzmann's version K is the
Boltzmann constant measured in Joule/Kelvin. Also, in Gibbs-Boltzmann's
version natural logarithms are traditionally used, while in Shannon's version
it is most often a logarithm to the base of 2, which results in measuring H in *bits
per character* (although natural logarithms are occasionally used as well, in
which case instead of *bits* the units are called *nats* or *nits*).
Finally, in Shannon's version p_i is the probability of an "event" (usually the event is the occurrence of a
given character at a specific location in a string of characters), while in
Gibbs-Boltzmann's version p_i is the probability of a microstate of the system.

The mentioned differences between Shannon's and thermodynamic entropies do not affect the principal interpretation of entropy as a measure of disorder. (In statistical physics [6] entropy is treated as an essentially dimensionless quantity, thus erasing its most salient difference from Shannon's uncertainty).
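That these differences (the constant K and the logarithm base) are purely a matter of units can be verified numerically; the function below is my own illustration, not a standard library call:

```python
import math

def entropy(probs, k=1.0, base=2):
    """H = -K * sum(p_i * log(p_i)); base 2 gives bits, base e gives nats."""
    return -k * sum(p * math.log(p, base) for p in probs if p > 0)

probs = [0.5, 0.25, 0.25]
h_bits = entropy(probs, base=2)        # 1.5 bits per character
h_nats = entropy(probs, base=math.e)   # the same quantity, ~1.0397 nats

# Converting nats to bits recovers the same number: the choice of base
# (like the choice of K) changes the units, not the quantity measured.
print(h_bits, h_nats / math.log(2))
```

Multiplying by a dimensional K (the Boltzmann constant) similarly rescales H without altering its interpretation.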

In order to interpret thermodynamic
entropy as a measure of disorder, it is useful to review the behavior of
"parameters" of the system. Generally speaking, each system is affected by an
immense number of parameters, but only a fraction of them exert a substantial
influence on the system's behavior. Choosing the set of a system's *essential*
parameters means selecting those few parameters which are assumed to
principally affect the system's behavior, and ignoring a vast number of other
parameters assumed to play an insignificant role in that behavior. The selected
"crucial" parameters must be given a quantitative representation. For example,
for a system which is a certain amount of ideal gas of mass *m,* the
essential parameters can be reasonably limited to only three – pressure,
volume, temperature – P,V,T. Then the system's degree of order may be defined
as the degree to which the parameters have *gradients*.

In a
perfectly "disordered" (randomized) system there are no gradients of the
appropriate parameters. Of course order is a matter of degree. A system can be
more or less ordered (or disordered).
There can never be certainty that a system is perfectly disordered
(random). However, it is in principle feasible to assert that a system is *not*
random – to this end it is sufficient to point to any non-zero gradients of its
parameters (for example, for the ideal gas of mass *m*, the gradients in
question may be such as |dP/dx|>0, |dV/dx|>0, |dT/dx|>0, |dP/dy|>0,
|dV/dy|>0, |dT/dy|>0, |dP/dz|>0, |dV/dz|>0, and/or |dT/dz|>0,
etc.).
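The gradient criterion can be checked numerically. In the sketch below (the one-dimensional binning and the particle counts are my own arbitrary choices), uniformly spread "molecules" show almost no density gradient, while molecules confined to one corner show a steep one:

```python
import random

def density_gradient(positions, n_cells=10, length=1.0):
    """Mean absolute difference in particle density between adjacent cells
    of a one-dimensional 'container' of the given length."""
    counts = [0] * n_cells
    for x in positions:
        counts[min(int(x / length * n_cells), n_cells - 1)] += 1
    densities = [c / len(positions) for c in counts]
    diffs = [abs(densities[i + 1] - densities[i]) for i in range(n_cells - 1)]
    return sum(diffs) / len(diffs)

random.seed(1)
spread = [random.uniform(0, 1) for _ in range(100_000)]     # gas filling the container
clumped = [random.uniform(0, 0.1) for _ in range(100_000)]  # gas driven into one corner

print(density_gradient(spread))   # small: no appreciable gradient, high disorder
print(density_gradient(clumped))  # large: steep gradient, high order
```

The uniformly spread configuration exhibits only small fluctuation-level gradients, while the clumped configuration exhibits a large gradient at the boundary of the occupied region, matching the identification of gradients with order.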

I believe this approach in no way contradicts Solomonoff-Kolmogorov-Chaitin's concepts of randomness and complexity.

In view of the above, I submit that it is possible to assert that the thermodynamic entropy (treated statistically) is essentially the same concept as informational entropy (i.e. as Shannon's uncertainty) and can reasonably be chosen as a measure of disorder as well.

Now let us turn to
another point in Stewart's paper. On page 142 we read, "In information theory
(Shannon & Weaver 1964) there is a quantity called *entropy *that is
the negative of information."

Since I
don't have at hand Shannon & Weaver's book of 1964, I'll refer to Shannon's
original work of 1948, titled *A Mathematical Theory of Communication*.[7]
If we look it up we'll discover that entropy, in Shannon's original interpretation,
could hardly be consistently interpreted as the negative of information. In
fact, Shannon had defined entropy as the *average information* per
character in a string, and it is this definition which reflects the contents of
the above formula for H. I don't see how a quantity which is the average
information can simultaneously be construed as the negative of information in a
general interpretation.

Perhaps the
confusion may stem from the view of entropy as a property of a system. In fact
the only consistent interpretation of entropy is viewing it as a *measure of
disorder* but not as a *property* of either the system or of its
components. As a *measure*, entropy of, say, gas, is not a property of the
molecules but only a measure of the degree of randomness of their distribution
over the container's volume (and/or of the distribution of molecules'
momenta). Likewise, Shannon entropy of
a string is not a property of characters of which the string is composed but
the measure of randomness of the characters' distribution along the string.

A measuring
tape can be used to measure, say, my height. However, nobody would think that
the tape *is* my height, or that the tape is a part of my organism.
Likewise, entropy can be used to *measure* the loss of information but it
does not make it the negative of information as a definition. Entropy (i.e., average
information in Shannon's conceptual system) can be used to measure the decrease
of uncertainty at the receiving end of a communication chain but it does not make
it the negative of information either. It can equally be used to measure the
amount of information carried out from the source of a sent message (whereas it
is not necessarily accompanied by the loss of information at the source;
sending information out may not cause a loss of information at the source). Entropy
and information have no "plus" or "minus" sign inherently attached – they can
be construed either positive or negative depending on the vantage point. What
is negative for John may be positive for Mary.

Indeed, continuing his discourse Stewart denies the quoted interpretation of Shannon's entropy as the negative information. For example, on page 143 we read,

"...in information theory, the information 'in' a message is not negative information-theoretic entropy. Indeed, the entropy of the source remains unchanged, no matter how many messages it generates."

I agree with that statement. Note that it is at odds with the concepts affirmed in the papers by Paul Davies [8, 9] in the same collection, as is discussed in my review of Davies's first paper [2]. (I intend to write up a discussion of Davies' second paper [9] wherein, in particular, to argue against Davies's thesis regarding entropy as being a negative of information; I plan to specifically discuss there Davies's example of information's and entropy's changes in the process of achieving equilibrium between portions of a gas that initially had different temperatures.)

I also believe that accepting the notion of
entropy as a *measure* rather than a *property* makes attempts to
define a meaningful "law of conservation of information" hopeless – such a law
would be meaningless as information is not a commodity which can be conserved,
and the above quotation from page 143 of Stewart's article is in agreement with
my statement. (Another contributor to the collection in question, William
Dembski, has been promoting for several years his version of the "law of the
conservation of information" which he claims to be the "4th law of
thermodynamics." Besides the general fault of such attempts (conservation
of information being a concept void of a meaningful interpretation), the
alleged law, in Dembski's formulation, contradicts the 2nd law of
thermodynamics, as discussed elsewhere [10]).

Returning to the role of thermodynamic entropy as a measure of disorder, let us note that the entropy of texts (including molecular chains such as DNA and the like), which reflects the degree of a text's incompressibility in the Kolmogorov-Chaitin sense, can on the other hand be naturally tied to gradients of appropriate parameters.

For
example, as was shown [11], semantically meaningful texts, regardless of
language, authorship, etc., display what we called Letter Serial Correlation
(LSC) which is absent in gibberish texts. In particular, LSC in semantically
meaningful strings results in the existence of what we called the Average Domain
of Minimal Letter Variability which is not found in gibberish strings. Along the semantically meaningful string
there is a gradient of a parameter which we called Letter Serial Correlation
sum, S_m. No such gradients are observed in "disordered" texts, i.e.
in randomized gibberish, where S_m displays haphazard variations
along the string. In highly ordered but meaningless strings of symbols, gradients
of S_m often display wave-like regular variations along the string.
In semantically meaningful strings gradients of S_m are also variable
along the text, but the variations of S_m along the string display a
peculiar, easily recognizable shape, *qualitatively identical* for all
meaningful texts regardless of language but differing quantitatively depending
on the language and specific texts.

Accordingly, the entropy of texts is the highest for chaotic gibberish
"texts" where S_m varies haphazardly so no gradients emerge (except for
accidental local mini-gradients). It is minimal in highly ordered meaningless
texts where the gradient of S_m varies along the string in a regular wave-like
manner. For semantically meaningful texts entropy has intermediate values.
Therefore it also seems reasonable to define entropy as a measure of disorder
for texts from the standpoint of gradients. (There seems to be a certain
limitation to this interpretation, pertaining to ideally ordered meaningless
strings; it is, however, due to the peculiar choice of S_m as the parameter in
question rather than to the overall gist of the argument. This limitation can
be removed by choosing, instead of S_m, another, slightly different parameter;
this point will not be further discussed here because of its extraneous
character for this discourse.)

For thermodynamic systems, to consistently apply the above definition of entropy as a measure of disorder we have to look at the behavior of appropriate thermodynamic parameters as well, and if we discover gradients, we define the system as having a degree of order. If a process results in steeper gradients, whether constant or variable, entropy decreases (as well as the degree of disorder).

The 2nd
law of thermodynamics requires increase of entropy only in closed systems, i.e.
as a result of *spontaneous* processes only. If a process is imposed on a
system, entropy can behave in different ways and the 2nd law does
not prohibit its decrease (while also not asserting its inevitability).

Stewart considers an example of gas gathering in a corner of a container. The gathering of gas molecules in a corner of a container, while not entirely impossible, is highly unlikely to occur spontaneously. We conclude that a spontaneous process most likely results in gas spreading all over the container's volume, and is accompanied by entropy increase and a reduction (up to the almost complete elimination) of gradients of density, temperature, and pressure (except for local fluctuations). Nothing prohibits interpreting this as an increase of disorder, and such a definition does not contradict the definition of entropy as a measure of disorder. If, though, molecules are driven into a corner by external force, entropy of gas decreases and we may legitimately assert that the order within the system increases as well; such an assertion will be in tune with the interpretation of disorder both from the standpoint of Kolmogorov-Chaitin's incompressibility of a string representing the system and from the standpoint of gradients as measures of order.

Stewart suggests that a Fourth law of thermodynamics may indeed exist, but in his view it is not the putative law of conservation of information (which, I believe, belongs in crank science) but rather a more plausible "2nd law of gravitics."

I enjoyed Stewart's discourse where he suggested that putative new law, although I have some reservations. One very minor reproach I could offer is about Stewart's reference (page 116) to "standard three laws of thermodynamics." It is an odd reference because there are not three but four "standard" laws of thermodynamics (the zeroth, the first, the second, and the third). An additional law of thermodynamics, if ever established, would indeed be named the 4th law, but not because there are only three such laws so far, but because of the peculiar way the existing four laws have been historically named.

The "2nd law of gravitics" which, according to Stewart, is so far just a "place-holder" for the future law supplementing the existing laws of thermodynamics and which is expected to reside outside of thermodynamics, is formulated by Stewart (page 146):

As time passes (forwards according to the thermodynamic arrow) gravitic systems tend to become more and more clumped.

To my mind, the above statement sounds more like an
observation, i.e., a statement of fact than like what usually is given the
status of a law of science. Of course, in a broad sense any statement of facts
may be claimed a law, and to my mind this is a legitimate usage of terms. Say,
"at a pressure of about 10^{5} Pascal, ice melts at 273.16 Kelvin" is a
statement of fact derived from observation and may legitimately be viewed as a
scientific "law." However, if we survey the conventional practice, the
honorific title of "law" is more often reserved for more profound statements,
especially if such a statement is offered as something on a par with the 2nd
law of thermodynamics. This is, of
course, a minor point, more so because Stewart adds:

"Observations support this law – but what causes the clumping?"

Stewart
suggests that clumping results from "scaling symmetry" of the law of
gravitation. This topic is beyond the scope of my commentary, so all I can say
that I tend to accept the notions about "scaling symmetry" and "oligarchic
growth" [12] of clusters of matter, but all this, in my view, is yet
insufficient to justify introduction of a new and very profound "law" which
would complement the four known laws of thermodynamics. While Stewart's ideas
regarding the possible law serving as a substitute for the 4th law
of thermodynamics are, in my view, of interest and deserve further analysis, so
far I remain unconvinced that the above "2nd law of gravitics" in
that formulation has indeed accomplished the task. Of course I have no
objection to viewing it as a "place holder" for a possible future law.

[1]
Ian Stewart. "The Second Law of Gravitics and the Fourth Law of
Thermodynamics." In Niels Henrik Gregersen, editor, *From Complexity to
Life: On the Emergence of Life and Meaning.* NY: Oxford University
Press, 2003: 114-150.

[2] Mark
Perakh. "Paul Davies: Emergentist vs. Reductionist." *Talk Reason*, www.talkreason.org/articles/Davies1.cfm
or http://members.cox.net/perakm/davies.htm. Posted on September 25, 2004.

[3] Mark Perakh. "Defining Complexity." *Talk Reason*, www.talkreason.org/articles/complexity.pdf or http://members.cox.net/perakm/complexity_ben.htm. Posted on August 12, 2004.

[4] Gregory
J. Chaitin. "Randomness and a Mathematical Proof." In Niels Henrik Gregersen,
editor,* From Complexity to Life: On the Emergence of Life and Meaning*.
NY: Oxford University Press, 2003: 19-33.

[5] Peter
Grünwald & Paul Vitányi. "Shannon Information and Kolmogorov
Complexity." http://www.arxiv.org/abs/cs.IT/0410002, accessed on October 5, 2004.

[6] Lev
D. Landau and Evgenyi M. Lifshits, *Statisticheskaya Fizika*
(Statistical Physics, in Russian; while the reference here is to the Russian
original, an English translation is available). Moscow, Nauka publishers,
1964: 38-44.

[7] Claude
E. Shannon. "A Mathematical Theory of Communication," parts 1 and 2, *Bell
System Technical Journal*, 27 (July 1948): 379-423; (October 1948):
623-656.

[8] Paul
Davies, "Introduction: Toward an Emergentist Worldview." In Niels Henrik Gregersen, editor, *From
Complexity to Life: On the Emergence of Life and Meaning*. NY: Oxford
University Press, 2003: 3-16.

[9] Paul
Davies. "Complexity and the Arrow of Time." In Niels Henrik Gregersen,
ed., *From Complexity to Life*, NY: Oxford University Press, 2003:
74.

[10] Mark
Perakh. *Unintelligent Design*. Prometheus Books, 2004 (chapter 1:
88-90).

[11] Mark
Perakh and Brendan McKay. "Letter Serial Correlation." In *Mark Perakh's
website*. http://members.cox.net/marperak/Texts/;
accessed on October 24, 2004.

[12] E.
Kokubo and S. Ida. "Oligarchic Growth of Protoplanets." *Icarus*, 127
(1990):171-178 (as referred by Stewart in the article reviewed here;
reference not verified).

Location of this article: http://www.talkreason.org/articles/Stewart.cfm