Ian Stewart:
Entropy vs. disorder and gravitics vs. thermodynamics
By Mark Perakh
Posted November 12, 2004
Ian Stewart is a distinguished British mathematician and
writer. This essay is a brief commentary on certain aspects of Ian Stewart's
article titled "The Second Law of Gravitics and the Fourth Law of
Thermodynamics" which is found in the
collection From Complexity to Life edited by Niels Henrik Gregersen. [1]
In two essays published previously I commented on the articles by Paul Davies [2]
and Charles H. Bennett [3] in the same collection, so this essay is the third
installment in the planned review of the entire collection (or at least of most
of its chapters).
Overall, Ian
Stewart's paper is thought-provoking and contains many points I agree with. I
have, however, a few reservations regarding some of Stewart's notions.
One such
notion concerns the relationship between entropy and "disorder." For example, Stewart renders his thesis as
follows (page 122):
"... order/disorder metaphor is
fraught with tacit assumptions (that are often wrong) and is inconsistent, ill
defined, and by no mean unique."
He also writes (page 144):
"Thermodynamic entropy is metaphorically identified with disorder.
However, the assumptions involved in such terminology are exceedingly dubious."
On pages 122-123, Stewart discusses an example of gas that
expands from one corner of a container to fill the entire available volume, and
writes that the conventional interpretation of that process concludes (page
123) that "loss of order corresponds to gain of entropy, so entropy equals
disorder." However, in Stewart's view such an interpretation is far from being
universal. As I understand Stewart's position, the distinction
between "order" and "disorder" has no rigorous definition, only an intuitive
understanding, which is one of the reasons why the definition of entropy as a
measure of disorder is not well-substantiated.
To my mind,
this conclusion of Stewart's is not quite convincing.
Indeed, if
there is no rigorous and commonly accepted definition of "disorder" except for
an intuitive understanding (as Stewart seems to think) then whatever definition
of entropy were chosen, it couldn't contradict the nonexistent
definition of disorder. If this were the case, what would prevent us from choosing
entropy to be a measure of "disorder" by carefully defining the connection
between the two concepts? Adopting entropy
as a measure of disorder is not prohibited by any known laws of physics or theorems
of information theory.
In fact,
however, there are definitions of disorder. Therefore, in order to decide whether
defining entropy as a measure of disorder is substantiated, we have to see if
such a definition is compatible with the existing definitions of
disorder.
In my view,
the definition of entropy as a measure of disorder can quite well fit some of the
existing concepts of disorder. In particular, a consistent and fruitful
definition of disorder is given in the algorithmic theory of complexity/information
by Solomonoff, Kolmogorov, and Chaitin (SKC), whose seminal concepts have been
explained, for example, in the elegant article by Gregory Chaitin reprinted in
the same collection.[4]
The SKC
theory ties together complexity, randomness, and information (although
"Kolmogorov information" differs from Shannon information [5]). It is easiest
to apply it to informational systems, e.g. to texts.
Let us
represent the system under investigation as a text. A sufficiently complete description of the system in plain words
and/or mathematical formulae can serve as the source of such a text. This
description can always be transcribed in binary symbols, i.e. as a string of
zeroes and ones. (A simple example is representing a text, written in any
conventional alphabet, in Morse code, wherein each dot represents a 0 and
each dash a 1.)
In terms of
algorithmic information theory, a truly random (i.e. disordered) text
cannot be compressed while preserving the amount of information it carries. In
terms of classical information theory, a random text has zero redundancy.
Redundancy is a single-valued function of the text's Shannon entropy (the
latter measured in bits per character):
R = 1 - S_{act}/S_{max}
where R
is redundancy (a dimensionless quantity, 0≤R<1), S_{max} is
the maximum possible specific entropy of a text (i.e. the entropy per character
of a perfectly random text whose redundancy is R=0), which equals S_{max}=log_{2} N,
where N is the number of symbols available in the alphabet, and S_{act} is the
specific entropy of the actual specimen of text which, unlike the perfectly
random text, possesses a certain redundancy R>0.
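The redundancy formula can be illustrated with a minimal Python sketch. It rests on an assumption not made in the text above: S_{act} is estimated from single-character frequencies only, ignoring correlations between neighboring characters, so the estimate can only overstate the entropy of a real text. The function names and sample strings are hypothetical.

```python
import math
from collections import Counter


def specific_entropy(text: str) -> float:
    """Estimate S_act in bits per character from single-character frequencies."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def redundancy(text: str, alphabet_size: int) -> float:
    """R = 1 - S_act / S_max, with S_max = log2(N) for an alphabet of N symbols."""
    s_max = math.log2(alphabet_size)
    return 1.0 - specific_entropy(text) / s_max


# Hypothetical four-letter alphabet {a, b, c, d}.
equal_freq = "abcd" * 25       # all four symbols equally frequent
skewed = "a" * 97 + "bcd"      # one symbol dominates

r_equal = redundancy(equal_freq, 4)
r_skewed = redundancy(skewed, 4)
print(r_equal, r_skewed)
```

Under this frequency-only estimate the periodic string "abcd" repeated 25 times gets R=0 even though it is obviously highly ordered; a fuller estimate of S_{act} would have to account for the dependence of each character on its predecessors.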
To my mind,
the definition of entropy as a measure of disorder (i.e. of degree of
randomness) is fully compatible with the concept of disorder (randomness) as
defined in the algorithmic theory of information.
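As a rough illustration of incompressibility as a mark of disorder, the sketch below compares how well a general-purpose compressor shrinks an ordered string and a pseudo-random one. Several assumptions are mine rather than the SKC theory's: zlib is only a practical stand-in (algorithmic complexity itself is not computable, and a compressed length gives merely an upper bound on it), and the "random" bytes come from a seeded pseudo-random generator, which in the strict algorithmic sense is itself a short program.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


n = 10_000
rng = random.Random(0)  # fixed seed so the run is repeatable

ordered = b"ab" * (n // 2)                                   # perfectly periodic string
disordered = bytes(rng.getrandbits(8) for _ in range(n))     # pseudo-random bytes

ratio_ordered = compression_ratio(ordered)
ratio_disordered = compression_ratio(disordered)
print(ratio_ordered, ratio_disordered)
```

The ordered string shrinks to a small fraction of its length, while the pseudo-random one does not shrink at all under this compressor.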
Is the same
true for thermodynamic entropy? First, recall that Shannon's formula for a text's
entropy (also referred to as Shannon's uncertainty) is the same as the
Boltzmann-Gibbs formula for thermodynamic entropy except for a constant. This
formula is
H = -K Σ_{i} p_{i} log p_{i}.
K is a constant which in Shannon's version is
dimensionless and taken to be K=1, while in the Gibbs-Boltzmann version K is the
Boltzmann constant, measured in joules per kelvin. Also, in the Gibbs-Boltzmann
version natural logarithms are traditionally used, while in Shannon's version
the logarithm is most often taken to base 2, which results in measuring H in bits
per character (although natural logarithms are occasionally used as well, in
which case the units are called nats or nits instead of bits).
Finally, in Shannon's version p_{i} is the probability of an "event" (usually the event is the occurrence of a
given character at a specific location in a string of characters), while in
the Gibbs-Boltzmann version p_{i} is the probability of an accessible
thermodynamic microstate.
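The point that the two versions differ only in the constant and the base of the logarithm can be sketched numerically. The probability values below are arbitrary illustrative numbers, not data from any source, and the function name is hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K (SI value)


def entropy(probs, k=1.0, log=math.log2):
    """H = -K * sum(p_i * log p_i); terms with p_i = 0 are skipped."""
    return -k * sum(p * log(p) for p in probs if p > 0)


# Arbitrary example distribution over four events (or four microstates).
probs = [0.5, 0.25, 0.125, 0.125]

h_bits = entropy(probs)                                   # Shannon: K = 1, log base 2
h_nats = entropy(probs, log=math.log)                     # Shannon: K = 1, natural log
s_gibbs = entropy(probs, k=K_BOLTZMANN, log=math.log)     # Gibbs: K = k_B, natural log

print(h_bits, h_nats, s_gibbs)
```

The same function serves all three cases; h_nats equals h_bits multiplied by ln 2, and s_gibbs equals h_nats multiplied by the Boltzmann constant.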
The
mentioned differences between Shannon's and thermodynamic entropies do not
affect the principal interpretation of entropy as a measure of disorder. (In statistical
physics [6] entropy is treated as an essentially dimensionless quantity, thus
erasing its most salient difference from Shannon's uncertainty).
In order to interpret thermodynamic
entropy as a measure of disorder, it is useful to review the behavior of
"parameters" of the system. Generally speaking, each system is affected by an
immense number of parameters, but only a fraction of them exert a substantial
influence on the system's behavior. Choosing the set of a system's essential
parameters means selecting those few parameters which are assumed to
principally affect the system's behavior, and ignoring a vast number of other
parameters assumed to play an insignificant role in that behavior. The selected
"crucial" parameters must be given a quantitative representation. For example,
for a system which is a certain amount of ideal gas of mass m, the
essential parameters can be reasonably limited to only three – pressure,
volume, temperature – P,V,T. Then the system's degree of order may be defined
as the degree to which the parameters have gradients.
In a
perfectly "disordered" (randomized) system there are no gradients of the
appropriate parameters. Of course order is a matter of degree. A system can be
more or less ordered (or disordered).
There can never be certainty that a system is perfectly disordered
(random). However, it is feasible in principle to assert that a system is not
random: to this end it is sufficient to point to any nonzero gradients of its
parameters (for example, for the ideal gas of mass m, the gradients in
question may be such as dP/dx≠0, dV/dx≠0, dT/dx≠0, dP/dy≠0,
dV/dy≠0, dT/dy≠0, dP/dz≠0, dV/dz≠0, and/or dT/dz≠0,
etc.).
I believe this
approach in no way contradicts the Solomonoff-Kolmogorov-Chaitin concepts of
randomness and complexity.
In view of
the above, I submit that it is possible to assert that the thermodynamic
entropy (treated statistically) is essentially the same concept as
informational entropy (i.e. as Shannon's uncertainty) and can reasonably be
chosen as a measure of disorder as well.
Now turn to
another point in Stewart's paper. On page 142 we read, "In information theory
(Shannon & Weaver 1964) there is a quantity called entropy that is
the negative of information."
Since I
don't have at hand Shannon & Weaver's book of 1964, I'll refer to Shannon's
original work of 1948, titled A Mathematical Theory of Communication.[7]
If we look it up we'll discover that entropy, in Shannon's original interpretation,
could hardly be consistently interpreted as the negative of information. In
fact, Shannon had defined entropy as the average information per
character in a string, and it is this definition which reflects the contents of
the above formula for H. I don't see how a quantity which is the average
information can simultaneously be construed, as a general interpretation, as the negative of
information.
Perhaps the
confusion may stem from the view of entropy as a property of a system. In fact
the only consistent interpretation of entropy is viewing it as a measure of
disorder but not as a property of either the system or of its
components. As a measure, entropy of, say, gas, is not a property of the
molecules but only a measure of the degree of randomness of their distribution
over the container's volume (and/or of the distribution of molecules'
momenta). Likewise, Shannon entropy of
a string is not a property of characters of which the string is composed but
the measure of randomness of the characters' distribution along the string.
A measuring
tape can be used to measure, say, my height. However, nobody would think that
the tape is my height, or that the tape is a part of my organism.
Likewise, entropy can be used to measure the loss of information but it
does not make it the negative of information as a definition. Entropy (i.e., average
information in Shannon's conceptual system) can be used to measure the decrease
of uncertainty at the receiving end of a communication chain but it does not make
it the negative of information either. It can equally be used to measure the
amount of information carried out from the source of a sent message (whereas it
is not necessarily accompanied by the loss of information at the source;
sending information out may not cause a loss of information at the source). Entropy
and information have no "plus" or "minus" sign inherently attached – they can
be construed as either positive or negative depending on the vantage point. What
is negative for John may be positive for Mary.
Indeed, continuing
his discourse Stewart denies the quoted interpretation of Shannon's entropy as
the negative information. For example, on page 143 we read,
"...in information theory, the
information 'in' a message is not negative information-theoretic entropy.
Indeed, the entropy of the source remains unchanged, no matter how many
messages it generates."
I agree with that statement. Note that it is at odds with the concepts affirmed in the papers by
Paul Davies [8, 9] in the same collection, as is discussed in my review of
Davies's first paper [2]. (I intend to write up a discussion of Davies's second
paper [9] in which I will, in particular, argue against Davies's thesis regarding
entropy as being the negative of information; I plan to specifically discuss
there Davies's example of the changes of information and entropy in the process of
achieving equilibrium between portions of a gas that initially had different temperatures.)
I also believe that accepting the notion of
entropy as a measure rather than a property makes attempts to
define a meaningful "law of conservation of information" hopeless – such a law
would be meaningless as information is not a commodity which can be conserved,
and the above quotation from page 143 of Stewart's article is in agreement with
my statement. (Another contributor to the collection in question, William
Dembski, has been promoting for several years his version of the "law of the
conservation of information" which he claims to be the "4th law of
thermodynamics." Apart from the general fault of such attempts, namely that
conservation of information is a concept void of a meaningful interpretation, the
alleged law, in Dembski's formulation,
contradicts the 2nd law of thermodynamics, as discussed
elsewhere [10].)
Back to the
role of thermodynamic entropy as a measure of disorder, let us note that the
entropy of texts (including molecular chains such as DNA and the like), which
reflects the degree of a text's incompressibility in the Kolmogorov-Chaitin sense,
can also be naturally tied to gradients of appropriate parameters.
For
example, as was shown [11], semantically meaningful texts, regardless of
language, authorship, etc., display what we called Letter Serial Correlation
(LSC) which is absent in gibberish texts. In particular, LSC in semantically
meaningful strings results in the existence of what we called the Average Domain
of Minimal Letter Variability which is not found in gibberish strings. Along the semantically meaningful string
there is a gradient of a parameter which we called Letter Serial Correlation
sum, S_{m}. No such gradients are observed in "disordered" texts, i.e.
in randomized gibberish, where S_{m} displays haphazard variations
along the string. In highly ordered but meaningless strings of symbols, gradients
of S_{m} often display wavelike regular variations along the string.
In semantically meaningful strings, gradients of S_{m} are also variable
along the text, but the variations of S_{m} along the string display a peculiar, easily recognizable shape, qualitatively
identical for all meaningful texts regardless of language but differing
quantitatively depending on the language and specific texts.
Accordingly,
the entropy of texts is the highest for chaotic gibberish "texts" where S_{m}
varies haphazardly so no gradients emerge (except for accidental local
mini-gradients). It is minimal in highly ordered meaningless texts where the
gradient of S_{m} varies along the string in a regular wavelike
manner. For semantically meaningful texts entropy has intermediate values.
Therefore it also seems reasonable to define entropy as a measure of disorder
for texts from the standpoint of gradients. (There seems to be a certain
limitation to this interpretation, pertaining to ideally ordered meaningless
strings; it is, however, due to the peculiar choice of S_{m} as the
parameter in question rather than to the overall gist of the argument. This
limitation can be removed by choosing, instead of S_{m}, another,
slightly different parameter; this point will not be further discussed here
because it is extraneous to this discourse.)
For
thermodynamic systems, to consistently apply the above definition of entropy as
a measure of disorder we have to look at the behavior of appropriate thermodynamic
parameters as well, and if we discover gradients, we define the system as
having a degree of order. If a process results in steeper gradients, whether
constant or variable, entropy decreases (as well as the degree of
disorder).
The 2nd
law of thermodynamics requires increase of entropy only in closed systems, i.e.
as a result of spontaneous processes only. If a process is imposed on a
system, entropy can behave in different ways and the 2nd law does
not prohibit its decrease (while also not asserting its inevitability).
Stewart considers
an example of gas gathering in a corner of a container. The gathering of gas
molecules in a corner of a container, while not entirely impossible, is highly
unlikely to occur spontaneously. We conclude that a spontaneous process most
likely results in gas spreading all over the container's volume, and is
accompanied by entropy increase and a reduction (up to the almost complete
elimination) of gradients of density, temperature, and pressure (except for
local fluctuations). Nothing prohibits interpreting this as an increase of
disorder, and such an interpretation does not contradict the definition of entropy
as a measure of disorder. If, though,
molecules are driven into a corner by an external force, the entropy of the gas decreases
and we may legitimately assert that the order within the system increases as
well; such an assertion will be in tune with the interpretation of disorder
both from the standpoint of Kolmogorov-Chaitin incompressibility of a string
representing the system and from the standpoint of gradients as measures of order.
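The gas example can be made concrete with a toy calculation. The sketch below is only an illustration under assumptions of my own: a two-dimensional unit "container" divided into a 4 by 4 grid of cells, point-like molecules with random positions, and entropy computed as the Shannon entropy of the cell-occupancy distribution (positions only, ignoring momenta). It is not a thermodynamic calculation, and all names in it are hypothetical.

```python
import math
import random
from collections import Counter


def cell_entropy(positions, cells_per_side=4):
    """Shannon entropy (bits) of how the molecules are distributed over grid cells."""
    counts = Counter(
        (min(int(x * cells_per_side), cells_per_side - 1),
         min(int(y * cells_per_side), cells_per_side - 1))
        for x, y in positions
    )
    n = len(positions)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000

# All molecules confined to one corner cell (a steep density gradient).
corner = [(rng.random() * 0.25, rng.random() * 0.25) for _ in range(n)]
# Molecules spread over the whole container (no density gradient apart from fluctuations).
spread = [(rng.random(), rng.random()) for _ in range(n)]

h_corner = cell_entropy(corner)
h_spread = cell_entropy(spread)
print(h_corner, h_spread)
```

With every molecule in a single cell the occupancy entropy is zero, while for the spread-out gas it approaches the maximum of log_{2} 16 = 4 bits, in line with reading the disappearance of the density gradient as a gain in entropy.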
Stewart
suggests that a Fourth law of thermodynamics may indeed exist, but in his view
it is not the putative law of conservation of information (which, I believe,
belongs in crank science) but rather a more plausible "2nd law of
gravitics."
I enjoyed
Stewart's discourse where he suggested that putative new law, although I have
some reservations. One very minor reproach I could offer is about Stewart's
reference (page 116) to "standard three laws of thermodynamics." It is an odd
reference because there are not three but four "standard" laws of
thermodynamics (the zeroth, the first, the second, and the third). An additional law of thermodynamics, if ever
established, would indeed be named the 4th law, but not because there
are only three such laws so far, but because of the peculiar way the existing
four laws have been historically named.
The "2nd
law of gravitics" which, according to Stewart, is so far just a "placeholder"
for the future law supplementing the existing laws of thermodynamics and which
is expected to reside outside of thermodynamics, is formulated by Stewart (page
146):
"As time passes (forwards
according to the thermodynamic arrow) gravitic systems tend to become more and
more clumped."
To my mind, the above statement sounds more like an
observation, i.e., a statement of fact, than like what usually is given the
status of a law of science. Of course, in a broad sense any statement of fact
may be claimed to be a law, and to my mind this is a legitimate usage of terms. Say,
"at a pressure of about 10^{5} pascals, ice melts at 273.15 kelvin" is a
statement of fact derived from observation and may legitimately be viewed as a
scientific "law." However, if we survey the conventional practice, the
honorific title of "law" is more often reserved for more profound statements,
especially if such a statement is offered as something on a par with the 2nd
law of thermodynamics. This is, of
course, a minor point, more so because Stewart adds:
"Observations support this law –
but what causes the clumping?"
Stewart
suggests that clumping results from "scaling symmetry" of the law of
gravitation. This topic is beyond the scope of my commentary, so all I can say
is that I tend to accept the notions about "scaling symmetry" and "oligarchic
growth" [12] of clusters of matter, but all this, in my view, is as yet
insufficient to justify the introduction of a new and very profound "law" which
would complement the four known laws of thermodynamics. While Stewart's ideas
regarding the possible law serving as a substitute for the 4th law
of thermodynamics are, in my view, of interest and deserve further analysis, so
far I remain unconvinced that the above "2nd law of gravitics" in
that formulation has indeed accomplished the task. Of course I have no
objection to viewing it as a "placeholder" for a possible future law.
REFERENCES
[1]
Ian Stewart. "The Second Law of Gravitics and the Fourth Law of
Thermodynamics." In Niels Henrik Gregersen, editor, From Complexity to
Life: On the Emergence of Life and Meaning. NY: Oxford University
Press, 2003: 114-150.
[2] Mark
Perakh. "Paul Davies: Emergentist vs. Reductionist." Talk Reason, www.talkreason.org/articles/Davies1.cfm
or http://members.cox.net/perakm/davies.htm. Posted on September 25, 2004.
[3] Mark
Perakh. "Defining Complexity." Talk Reason, www.talkreason.org/articles/complexity.pdf or http://members.cox.net/perakm/complexity_ben.htm.
Posted on August 12, 2004.
[4] Gregory
J. Chaitin. "Randomness and a Mathematical Proof." In Niels Henrik Gregersen,
editor, From Complexity to Life: On the Emergence of Life and Meaning.
NY: Oxford University Press, 2003: 19-33.
[5] Peter
Grunwald & Paul Vitányi, "Shannon Information and Kolmogorov
Complexity," http://www.arxiv.org/abs/cs.IT/0410002,
accessed on October 5, 2004.
[6] Lev
D. Landau and Evgenyi M. Lifshits, Statisticheskaya Fizika
(Statistical Physics, in Russian; while the reference here is to the Russian
original, an English translation is available). Moscow, Nauka publishers,
1964: 38-44.
[7] Claude
E. Shannon. "A Mathematical Theory of Communication," parts 1 and 2, Bell
System Technical Journal, (July 1948): 379-90; (October 1948):
623-37.
[8] Paul
Davies, "Introduction: Toward an Emergentist Worldview." In Niels Henrik Gregersen, editor, From
Complexity to Life: On the Emergence of Life and Meaning. NY: Oxford
University Press, 2003: 3-16.
[9] Paul
Davies. "Complexity and the Arrow of Time." In Niels Henrik Gregersen,
ed., From Complexity to Life, NY: Oxford University Press, 2003:
74.
[10] Mark
Perakh. Unintelligent Design. Prometheus Books, 2004 (chapter 1:
88-90).
[11] Mark
Perakh and Brendan McKay. "Letter Serial Correlation." In Mark Perakh's
website. http://members.cox.net/marperak/Texts/;
accessed on October 24, 2004.
[12] E.
Kokubo and S. Ida. "Oligarchic Growth of Protoplanets." Icarus, 127
(1990): 171-178 (as referred to by Stewart in the article reviewed here;
reference not verified).
