<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Intellectual Trespassing: Math Sciences Edition</title>
	<atom:link href="http://www.mathblog.ellerman.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mathblog.ellerman.org</link>
	<description>An interdisciplinary blog on logic and mathematics</description>
	<lastBuildDate>Wed, 01 Feb 2012 19:23:41 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>A Very Common Fallacy in Quantum Mechanics</title>
		<link>http://www.mathblog.ellerman.org/2011/11/a-common-qm-fallacy/</link>
		<comments>http://www.mathblog.ellerman.org/2011/11/a-common-qm-fallacy/#comments</comments>
		<pubDate>Thu, 01 Dec 2011 04:26:02 +0000</pubDate>
		<dc:creator>David Ellerman</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Philosophy]]></category>
		<category><![CDATA[Quantum Mechanics]]></category>
		<category><![CDATA[delayed choice]]></category>
		<category><![CDATA[double slit]]></category>
		<category><![CDATA[measurement]]></category>
		<category><![CDATA[quantum eraser]]></category>
		<category><![CDATA[retrocausality]]></category>
		<category><![CDATA[separation fallacy]]></category>
		<category><![CDATA[Stern-Gerlach]]></category>
		<category><![CDATA[superposition]]></category>

		<guid isPermaLink="false">http://www.mathblog.ellerman.org/?p=54</guid>
		<description><![CDATA[There is a very common fallacy, here called the separation fallacy, that is involved in the interpretation of quantum experiments involving a certain type of separation such as the: double-slit experiments, which-way interferometer experiments, polarization analyzer experiments, Stern-Gerlach experiments, and quantum eraser experiments. In each case, given an incoming quantum particle, the apparatus creates a [...]]]></description>
			<content:encoded><![CDATA[<p>There is a very common fallacy, here called the <em>separation fallacy</em>, that is involved in the interpretation of quantum experiments involving a certain type of separation such as the:</p>
<ul>
<li>double-slit experiments,</li>
<li>which-way interferometer experiments,</li>
<li>polarization analyzer experiments,</li>
<li>Stern-Gerlach experiments, and</li>
<li>quantum eraser experiments.</li>
</ul>
<p>In each case, given an incoming quantum particle, the apparatus creates a labelled or tagged (a type of entanglement) superposition of certain eigenstates (the &#8220;separation&#8221;). Detectors can be placed in certain positions so that when the evolving superposition state is finally projected or collapsed by the detectors, then only one of the eigenstates can register at each detector (due to the labels or tags). The <em>separation fallacy</em> mistakes the creation of a tagged or entangled superposition for a measurement. Thus it treats the particle as if it had already been projected or collapsed to an eigenstate at the separation apparatus rather than at the later detectors. But if the detectors were suddenly removed while the particle was in the apparatus, then the superposition would continue to evolve and have distinct effects (e.g., interference patterns in the two-slit experiment).</p>
<p>Hence the separation fallacy makes it seem that by the delayed choice to insert or remove the appropriately positioned detectors, one can <em>retro-cause</em> either a projection to an eigenstate, or no projection, at the particle&#8217;s entrance into the separation apparatus.</p>
<p>The separation fallacy is remedied by:</p>
<ul>
<li>taking superposition seriously, i.e., by seeing that the separation apparatus created an entangled superposition state of the alternatives (regardless of what happens later) which evolves until a measurement is taken, and</li>
<li>taking into account the role of detector placement (&#8220;contextuality&#8221;), i.e., by seeing that if a suitably positioned detector, as determined by the tags, can only detect one collapsed eigenstate, then it does not mean that the particle was already in that eigenstate prior to the measurement (e.g., it does not mean that the particle went through one slit, took one path in an interferometer, or was already in a polarization or spin eigenstate).</li>
</ul>
<p>The separation fallacy will first be illustrated in a non-technical manner for the first four experiments. Then the lessons will be applied in a slightly more technical discussion of quantum eraser experiments where the labels or tags are erased after the separation apparatus and where, due to the separation fallacy, incorrect inferences about retrocausality have been rampant.</p>
<p><span id="more-54"></span></p>
<h2>The double-slit experiment</h2>
<p>In the well-known setup for the <a href="http://en.wikipedia.org/wiki/Double-slit_experiment">double-slit experiment</a>, if a detector D₁ is placed a small distance after slit 1 so a particle &#8220;going through the other slit&#8221; cannot reach the detector, then a hit at the detector is usually interpreted as &#8220;the particle went through slit 1.&#8221;</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/two-slit.jpg"><img class="aligncenter size-full wp-image-56" title="two-slit" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/two-slit.jpg" alt="" width="145" height="147" /></a></p>
<p>But this is wrong. The particle is in a superposition state, which we might represent as <img src='http://s.wordpress.com/latex.php?latex=%7CSlit1%5Crangle%20%2B%20%7CSlit2%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|Slit1\rangle + |Slit2\rangle' title='|Slit1\rangle + |Slit2\rangle' class='latex' />, that evolves until it hits the detector which projects (or &#8216;collapses&#8217;) the superposition to one of (the evolved versions of) the slit-eigenstates. The particle&#8217;s state was not collapsed earlier so it was not previously in the <img src='http://s.wordpress.com/latex.php?latex=%7CSlit1%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|Slit1\rangle' title='|Slit1\rangle' class='latex' /> eigenstate, i.e., it did <em>not</em> &#8220;go through slit 1.&#8221;</p>
<p>Thus what is called &#8220;detecting which slit the particle went through&#8221; is a misinterpretation. It is only placing a detector in such a position so that when the superposition projects to an eigenstate, only one of the eigenstates can register in that detector. It is about detector placement; it is not about which-slit.</p>
<p>By erroneously talking about the detector &#8220;showing the particle went through slit 1,&#8221; we imply a type of retro-causality. If the detector is suddenly removed after the particle has passed the slits, then the superposition state continues to evolve and shows interference on the far wall (not shown)—in which case people say &#8220;the particle went through both slits.&#8221; Thus the &#8220;bad talk&#8221; makes it seem that by removing or inserting the detector after the particle is beyond the slits, one can retro-cause the particle to go through both slits or one slit only.</p>
<p>This sudden removal or insertion of detectors that can only detect one of the slit-eigenstates is a version of <a href="http://en.wikipedia.org/wiki/Wheeler%27s_delayed_choice_experiment">Wheeler&#8217;s delayed choice thought-experiment</a> [Wheeler 1978].</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Wheeler-delayed-2slit.jpg"><img class="aligncenter size-medium wp-image-77" title="Wheeler-delayed-2slit" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Wheeler-delayed-2slit-300x114.jpg" alt="" width="300" height="114" /></a></p>
<p>In Wheeler&#8217;s version of the experiment, there are two detectors which are positioned behind the removable screen so they can only detect one of the projected (evolved) slit eigenstates when the screen is removed. The choice to remove the screen or not is delayed until after a photon has traversed the two slits.</p>
<blockquote><p>&#8220;In the one case [screen in place] the quantum will &#8230; contribute to the record of a two-slit interference fringe. In the other case [screen removed] one of the two counters will go off and signal in which beam&#8211;and therefore from which slit&#8211;the photon has arrived.&#8221; [Wheeler 1978, p. 13]</p></blockquote>
<p>The separation fallacy is involved when Wheeler infers, from the fact that one of the specially placed detectors went off, that the photon had come from one of the slits, as if there had been a projection to one of the slit eigenstates at the slits rather than later at the detectors.</p>
<p>Similar examples abound in the literature. For instance, concerning the quasar-galaxy version of Wheeler&#8217;s delayed choice experiment, Anton Zeilinger remarks:</p>
<blockquote><p>We decide, by choosing the measuring device, which phenomenon can become reality and which one cannot. Wheeler explicates this by example of the well-known case of a quasar, of which we can see two pictures through the gravity lens action of a galaxy that lies between the quasar and ourselves. By choosing which instrument to use for observing the light coming from that quasar, we can decide here and now whether the quantum phenomenon in which the photons take part is interference of amplitudes passing on both side of the galaxy or whether we determine the path the photon took on one or the other side of the galaxy. [Zeilinger 2008, pp. 191-192]</p></blockquote>
<p>Occasionally, instead of stating that future actions can determine whether the particle passes &#8220;on both sides of the galaxy&#8221; (or through both slits) or only &#8220;on one or the other side&#8221; (or through only one slit), authors use the euphemism that the photon acts like a wave or a particle depending on the future actions.</p>
<blockquote><p>The important conclusion is that, while individual events just happen, their physical interpretation in terms of wave or particle might depend on the future; it might particularly depend on decisions we might make in the future concerning the measurement performed at some distant spacetime location in the future. [Zeilinger 2004, p. 207]</p></blockquote>
<p>These descriptions using the separation fallacy are unfortunately common and have generated a spate of speculations about retrocausality.</p>
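<p>The difference between an evolving superposition and a particle that &#8220;went through one slit&#8221; can be made concrete with a small numerical sketch. The Python below is an illustration only and is not taken from any of the cited treatments: the function name <code>far_wall_intensities</code> and the default slit separation, wavelength, and wall distance are hypothetical, and each slit is modelled as an idealized point source of equal amplitude. It compares the far-wall intensity of the superposition (amplitudes added, then squared) with that of a mixture of slit-eigenstates (squared, then added), which is what one would expect if a projection had already occurred at the slits.</p>

```python
import cmath
import math


def far_wall_intensities(x, slit_sep=1.0, wavelength=0.1, distance=100.0):
    """Return (superposition, mixture) intensities at position x on the far wall.

    All geometry values are hypothetical, in arbitrary units.
    """
    k = 2 * math.pi / wavelength
    # Path lengths from each slit to the point x on the far wall.
    r1 = math.hypot(distance, x - slit_sep / 2)
    r2 = math.hypot(distance, x + slit_sep / 2)
    # Evolved versions of the two slit-eigenstates, with equal weights.
    a1 = cmath.exp(1j * k * r1) / math.sqrt(2)
    a2 = cmath.exp(1j * k * r2) / math.sqrt(2)
    # Superposition: add amplitudes first, then square (interference term survives).
    superposition = abs(a1 + a2) ** 2
    # Mixture: square first, then add (what a projection at the slits would give).
    mixture = abs(a1) ** 2 + abs(a2) ** 2
    return superposition, mixture


if __name__ == "__main__":
    for x in (0.0, 2.5, 5.0, 7.5, 10.0):
        s, m = far_wall_intensities(x)
        print(f"x={x:5.1f}  superposition={s:.3f}  mixture={m:.3f}")
```

<p>Under these assumptions the superposition alternates between bright and dark fringes across the wall, while the mixture is flat. Only the former matches what is recorded when no detector is placed near the slits.</p>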
<h2>Which-way interferometer experiments</h2>
<p>Consider an interferometer with only one beam-splitter (e.g., half-silvered mirror) at the photon source which creates the superposition: <img src='http://s.wordpress.com/latex.php?latex=%7CLowerArm%5Crangle%20%2B%20%7CUpperArm%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|LowerArm\rangle + |UpperArm\rangle' title='|LowerArm\rangle + |UpperArm\rangle' class='latex' />.</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/interferometer1bs.jpg"><img class="aligncenter size-medium wp-image-57" title="interferometer1bs" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/interferometer1bs-300x203.jpg" alt="" width="300" height="203" /></a></p>
<p>When detector <img src='http://s.wordpress.com/latex.php?latex=D_%7B1%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{1}' title='D_{1}' class='latex' /> registers a hit, it is <em>said</em> that &#8220;the photon took the lower arm&#8221; of the interferometer and similarly for <img src='http://s.wordpress.com/latex.php?latex=D_%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{2}' title='D_{2}' class='latex' /> and the upper arm. This is the interferometer analogue of putting two up-close detectors after the two slits in the double-slit experiment.</p>
<p>And this standard description is wrong for the same reasons. The photon stays in the superposition state until the detectors force a projection to one of the (evolved) eigenstates. If the projection is to the evolved <img src='http://s.wordpress.com/latex.php?latex=%7CLowerArm%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|LowerArm\rangle' title='|LowerArm\rangle' class='latex' /> eigenstate then only <img src='http://s.wordpress.com/latex.php?latex=D_%7B1%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{1}' title='D_{1}' class='latex' /> will get a hit, and similarly for <img src='http://s.wordpress.com/latex.php?latex=D_%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{2}' title='D_{2}' class='latex' /> and the upper arm. The point is that the placement of the detectors (like in the double-slit experiment) only captures one or the other of the projected eigenstates.</p>
<p>Now insert a second beam-splitter as in the following diagram.</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/interferometer2bs.jpg"><img class="aligncenter size-medium wp-image-58" title="interferometer2bs" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/interferometer2bs-300x203.jpg" alt="" width="300" height="203" /></a></p>
<p>It is <em>said</em> that the second beam-splitter &#8220;erases&#8221; the &#8220;which-way information&#8221; so that a hit at either detector could have come from either arm, and thus an interference pattern emerges.</p>
<p>But this is wrong. The evolving superposition state <img src='http://s.wordpress.com/latex.php?latex=%7CLowerArm%5Crangle%20%2B%20%7CUpperArm%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|LowerArm\rangle + |UpperArm\rangle' title='|LowerArm\rangle + |UpperArm\rangle' class='latex' /> (which contains no which-way information) was always there until the detectors. The so-called &#8220;which-way information&#8221; was not there to be &#8220;erased&#8221; since the particle did not take one way or the other in the first place. The second beam-splitter only allows the two projected eigenstates from the superposition to be measured at <em>each</em> detector so, by shifting the phase <img src='http://s.wordpress.com/latex.php?latex=%5Cphi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\phi' title='\phi' class='latex' />, an interference pattern can be recorded at each detector.</p>
<p>By inserting or removing the second beam-splitter after the particle has traversed the first beam-splitter, the &#8220;bad talk&#8221; makes it seem that we can retro-cause the particle to go through both arms or only one arm.</p>
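<p>The role of the second beam-splitter can be sketched numerically. The following Python is a minimal illustration under stated assumptions, not a model of any particular apparatus: the function name <code>detector_probabilities</code> is hypothetical, both beam-splitters are taken to be ideal and symmetric (a Hadamard-type convention, which is only one of several in use), and the phase shift <code>phi</code> is applied to the upper arm. In both configurations the state in the arms is the same superposition; only what each detector can register differs.</p>

```python
import cmath
import math


def detector_probabilities(phi, second_beam_splitter):
    """Return (P(D1), P(D2)) for a relative phase phi between the arms.

    Assumes ideal 50/50 beam-splitters; the sign convention is an assumption.
    """
    # Superposition created by the first beam-splitter, with phase phi on the upper arm.
    lower = 1 / math.sqrt(2)
    upper = cmath.exp(1j * phi) / math.sqrt(2)
    if second_beam_splitter:
        # Each detector can now register contributions from both arm-eigenstates.
        d1 = (lower + upper) / math.sqrt(2)
        d2 = (lower - upper) / math.sqrt(2)
    else:
        # Each detector can register only one arm-eigenstate.
        d1, d2 = lower, upper
    return abs(d1) ** 2, abs(d2) ** 2


if __name__ == "__main__":
    for phi in (0.0, math.pi / 2, math.pi):
        print(phi, detector_probabilities(phi, False), detector_probabilities(phi, True))
```

<p>Without the second beam-splitter the probabilities are one half at each detector for every value of the phase. With it, they vary with the phase at <em>each</em> detector, even though the state in the arms was the same superposition in both cases.</p>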
<p>Instead of inserting the second beam-splitter, we could rig up more mirrors, a lens, and a detector so that when the detector causes the collapse, it will register <em>either</em> arm-eigenstate.</p>
<p style="text-align: center;"><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/interferometerlens3.jpg"><img class="aligncenter size-medium wp-image-73" title="interferometerlens" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/interferometerlens3-300x145.jpg" alt="" width="300" height="145" /></a></p>
<p>This might also be (mis)interpreted as &#8220;erasing&#8221; the &#8220;which-way information&#8221; but in fact the photon did not go through just one arm so there was no such information to be erased. The point is that the detector is positioned so that when the evolved superposition <img src='http://s.wordpress.com/latex.php?latex=%7CLowerArm%5Crangle%20%2B%20%7CUpperArm%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|LowerArm\rangle + |UpperArm\rangle' title='|LowerArm\rangle + |UpperArm\rangle' class='latex' /> is projected to one of the eigenstates, <em>either</em> eigenstate can register at that detector. Any setup that would allow a detector to register <em>both</em> collapsed arm-eigenstates (and <em>thus</em> to register the interference effects of the evolving superposition) would be a setup that could be (mis)interpreted as &#8220;erasing&#8221; the &#8220;which-way information.&#8221;</p>
<h2>Polarization analyzers and loops</h2>
<p>Another common textbook example of the separation fallacy is the treatment of polarization analyzers such as calcite crystals that are <em>said</em> to create two orthogonally polarized beams in the upper and lower channels, say <img src='http://s.wordpress.com/latex.php?latex=%7Cv%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|v\rangle' title='|v\rangle' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7Ch%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|h\rangle' title='|h\rangle' class='latex' /> from an arbitrary incident beam.<a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/xy-analyzer.jpg"><img class="aligncenter size-medium wp-image-60" title="xy-analyzer" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/xy-analyzer-300x47.jpg" alt="" width="440" height="68" /></a></p>
<p>The output beams from the analyzer <img src='http://s.wordpress.com/latex.php?latex=P&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P' title='P' class='latex' /> are routinely described as being &#8220;vertically polarized&#8221; and &#8220;horizontally polarized&#8221; as if the analyzer were a measurement that collapsed or projected the incident beam to either of those polarization eigenstates. This <em>seems</em> to follow because if one positions a detector in the upper beam then only vertically polarized photons are observed, and similarly for the lower beam and horizontally polarized photons. A blocking mask in one of the beams has the same effect as a detector: it projects the photons to eigenstates. If a blocking mask is inserted in the lower beam, then only vertically polarized photons will be found in the upper beam, and vice versa.</p>
<p>But here again, the story is about detector (or blocking mask) placement; it is not about the analyzer supposedly projecting a photon into one or the other of the eigenstates. The analyzer puts the incident photons into a superposition state. If a detector is placed in, say, the upper beam, then <em>that</em> is the measurement that collapses the evolved superposition state. If the collapse is to the vertical polarization eigenstate then it will register only in the upper detector, and similarly for a collapse to the horizontal polarization eigenstate for any detector placed in the lower position. Thus it is misleadingly said that the upper beam was already vertically polarized and the lower beam was already horizontally polarized, as if the analyzer had already done the projection to one of the eigenstates.</p>
<p>If the analyzer had in fact performed the measurement collapsing to the eigenstates, then any prior polarization of the incident beam would be lost. Hence assume that the incident beam was prepared in a specific polarization of, say, <img src='http://s.wordpress.com/latex.php?latex=%7C45%5E%7B%5Ccirc%7D%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|45^{\circ}\rangle' title='|45^{\circ}\rangle' class='latex' /> half-way between the states of vertical and horizontal polarization. Then follow the vh-analyzer <img src='http://s.wordpress.com/latex.php?latex=P&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P' title='P' class='latex' /> with its inverse <img src='http://s.wordpress.com/latex.php?latex=P%5E%7B-1%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P^{-1}' title='P^{-1}' class='latex' /> to form an <em>analyzer loop </em>[French and Taylor 1978].<a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/xy-analyzerloop.jpg"><img class="aligncenter size-medium wp-image-61" title="xy-analyzerloop" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/xy-analyzerloop-300x70.jpg" alt="" width="300" height="70" /></a></p>
<p>The characteristic feature of an analyzer loop is that it outputs the same polarization, in this case <img src='http://s.wordpress.com/latex.php?latex=%7C45%5E%7B%5Ccirc%7D%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|45^{\circ}\rangle' title='|45^{\circ}\rangle' class='latex' />, as the incident beam. This would be impossible if the <img src='http://s.wordpress.com/latex.php?latex=P&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P' title='P' class='latex' /> analyzer had in fact rendered all the photons into a vertical or horizontal eigenstate thereby destroying the information about the polarization of the incident beam. But since no collapsing measurement was in fact made in <img src='http://s.wordpress.com/latex.php?latex=P&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P' title='P' class='latex' /> or its inverse, the original beam can be the output of an analyzer loop.</p>
<p>Very few textbooks acknowledge that there is even a problem with presenting a polarization analyzer such as a calcite crystal as creating two beams with eigenstate polarizations, rather than as creating a superposition state in which appropriately positioned detectors can detect only one eigenstate when the detectors cause the projections to eigenstates.</p>
<p>One (partial) exception is Dicke and Wittke&#8217;s text [1960]. At first they present polarization analyzers <em>as if</em> they measured polarization and thus &#8220;destroyed completely any information that we had about the polarization&#8221; [p. 118] of the incident beam. But then they note a problem:</p>
<blockquote><p>&#8220;The equipment [polarization analyzers] has been described in terms of devices which measure the polarization of a photon. Strictly speaking, this is not quite accurate.&#8221; [p. 118]</p></blockquote>
<p>They then go on to consider the inverse analyzer <img src='http://s.wordpress.com/latex.php?latex=P%5E%7B-1%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P^{-1}' title='P^{-1}' class='latex' /> which combined with <img src='http://s.wordpress.com/latex.php?latex=P&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P' title='P' class='latex' /> will form an analyzer loop that just transmits the incident beam unchanged.</p>
<p>They have some trouble squaring this with their prior statement about the <img src='http://s.wordpress.com/latex.php?latex=P&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P' title='P' class='latex' /> analyzer destroying the polarization of the incident beam but they, unlike most texts, face up to the problem.</p>
<blockquote><p>&#8220;Stating it another way, although [when considered by itself] the polarization P completely destroyed the previous polarization Q [of the incident beam], making it impossible to predict the result of the outcome of a subsequent measurement of Q, in [the analyzer loop] the disturbance of the polarization which was effected by the box P is seen to be revocable: if the box P is combined with another box of the right type, the combination can be such as to leave the polarization Q unaffected.&#8221; [p. 119]</p></blockquote>
<p>They then go on to correctly note that the polarization analyzer <img src='http://s.wordpress.com/latex.php?latex=P&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P' title='P' class='latex' /> did not in fact project the incident photons into polarization eigenstates.</p>
<blockquote><p>&#8220;However, it should be noted that in this particular case [sic!], the first box P in [the first half of the analyzer loop] did not really measure the polarization of the photon: no determination was made of the channel (p1 or p2) which the photon followed in leaving the box P.&#8221; [p. 119]</p></blockquote>
<p>There is some classical imagery (like Schrödinger&#8217;s cat running around one side or the other side of a tree) that is sometimes used to illustrate quantum separation experiments when in fact it only illustrates how classical imagery can be misleading. Suppose an interstate highway separates at a city into both northern and southern bypass routes&#8211;like the two channels in a polarization analyzer loop. One can observe the bypass routes while a car is in transit and find that it is in one bypass route or another. But after the car transits whichever bypass it took without being observed and rejoins the undivided interstate, then it is said that the which-way information is erased so an observation cannot elicit that information.</p>
<p>This is <em>not</em> a correct description of the corresponding quantum separation experiment since the classical imagery does not contemplate superposition states. The particle-as-car is in a superposition of the two routes until an observation (e.g., a detector or &#8220;road-block&#8221;) collapses the superposition to one eigenstate or the other. Correct descriptions of quantum separation experiments require taking superposition seriously&#8211;so classical imagery should only be used <em>cum grano salis</em>.</p>
<p>This analysis might be rendered in a more technical but highly schematic way. The photons in the incident beam have a particular polarization <img src='http://s.wordpress.com/latex.php?latex=%7C%5Cpsi%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|\psi\rangle' title='|\psi\rangle' class='latex' /> such as <img src='http://s.wordpress.com/latex.php?latex=%7C45%5E%7B%5Ccirc%7D%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|45^{\circ}\rangle' title='|45^{\circ}\rangle' class='latex' /> in the above example. This polarization state can be represented or resolved in terms of the vh-basis as:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%7C%5Cpsi%5Crangle%20%3D%20%5Clangle%20v%7C%5Cpsi%5Crangle%7Cv%5Crangle%20%2B%20%5Clangle%20h%7C%5Cpsi%5Crangle%7Ch%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|\psi\rangle = \langle v|\psi\rangle|v\rangle + \langle h|\psi\rangle|h\rangle' title='|\psi\rangle = \langle v|\psi\rangle|v\rangle + \langle h|\psi\rangle|h\rangle' class='latex' /> .</p>
<p>The effect of the vh-analyzer <img src='http://s.wordpress.com/latex.php?latex=P&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P' title='P' class='latex' /> is then represented as tagging the vertical and horizontal polarization states with the upper and lower (or straight) channels so the vh-analyzer puts an incident photon into the superposition state:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Clangle%20v%7C%5Cpsi%5Crangle%7Cv%5Crangle_%7BU%7D%20%2B%20%5Clangle%20h%7C%5Cpsi%5Crangle%7Ch%5Crangle_%7BL%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\langle v|\psi\rangle|v\rangle_{U} + \langle h|\psi\rangle|h\rangle_{L}' title='\langle v|\psi\rangle|v\rangle_{U} + \langle h|\psi\rangle|h\rangle_{L}' class='latex' />.</p>
<p>If a blocker or detector were inserted in either channel, then this superposition state would project to one of the eigenstates, and then only vertically polarized photons would be found in the upper channel and horizontally polarized photons in the lower channel (as indicated by the tags).</p>
<p>The separation fallacy is to describe the vh-analyzer as if its effect by itself was to project an incident photon either into <img src='http://s.wordpress.com/latex.php?latex=%7Cv%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|v\rangle' title='|v\rangle' class='latex' /> in the upper channel or <img src='http://s.wordpress.com/latex.php?latex=%7Ch%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|h\rangle' title='|h\rangle' class='latex' /> in the lower channel&#8211;instead of only creating the above superposition state.  The mistake of describing the unmeasured polarization analyzer as creating two beams of eigenstate polarized photons is analogous to the mistake of describing a particle as going through one slit or the other in the unmeasured-at-slits double-slit experiment&#8211;and similarly for the other separation experiments.</p>
<p>It is fallacious to reason that &#8220;we know the photons are in one polarization state in one channel and in the orthogonal polarization state in the other channel <em>because</em> that is what we find when we measure the channels,&#8221; just as it is fallacious to reason &#8220;the particle has to go through one slit or another (or one arm or another in the interferometer experiment) <em>because</em> that is what we find when we measure it.&#8221; The purely operational (or &#8220;Copenhagen&#8221;) description (&#8220;what we find when we measure&#8221;) does not take superposition seriously.</p>
<p>In the analyzer<em> loop</em>, no measurement (detector or blocker) is made after the vh-analyzer. It is followed by the inverse vh-analyzer <img src='http://s.wordpress.com/latex.php?latex=P%5E%7B-1%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P^{-1}' title='P^{-1}' class='latex' /> which has the inverse effect of removing the <img src='http://s.wordpress.com/latex.php?latex=U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U' title='U' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=L&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='L' title='L' class='latex' /> tags so that a photon exits the loop in the state:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Clangle%20v%7C%5Cpsi%5Crangle%7Cv%5Crangle%20%2B%20%5Clangle%20h%7C%5Cpsi%5Crangle%7Ch%5Crangle%20%3D%20%7C%5Cpsi%5Crangle%20&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\langle v|\psi\rangle|v\rangle + \langle h|\psi\rangle|h\rangle = |\psi\rangle ' title='\langle v|\psi\rangle|v\rangle + \langle h|\psi\rangle|h\rangle = |\psi\rangle ' class='latex' />.</p>
<p style="text-align: left;">The inverse vh-analyzer does not &#8220;erase&#8221; the which-polarization information, since there was no measurement in the first place to reduce the superposition state to eigenstate polarizations in the channels of the analyzer loop.</p>
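<p>The schematic tagging and untagging described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions rather than a physical model: all function names are hypothetical, linear polarization at an angle is represented by real amplitudes in the vh-basis (angles measured from horizontal), and the channel tags are just labels attached to those amplitudes. The last function shows what would be predicted if the analyzer had in fact projected each photon to an eigenstate, so that the two predictions can be compared.</p>

```python
import math


def resolve(theta_deg):
    """Amplitudes of linear polarization at theta_deg in the vh-basis."""
    t = math.radians(theta_deg)
    return {"v": math.sin(t), "h": math.cos(t)}


def analyzer(state):
    """P: tag v with the upper channel U and h with the lower channel L."""
    return {("v", "U"): state["v"], ("h", "L"): state["h"]}


def inverse_analyzer(tagged):
    """Inverse of P: remove the channel tags, keeping the amplitudes."""
    return {pol: amp for (pol, _channel), amp in tagged.items()}


def pass_probability(state, theta_deg):
    """Probability that a pure state passes a test polarizer at theta_deg."""
    test = resolve(theta_deg)
    return (test["v"] * state["v"] + test["h"] * state["h"]) ** 2


def loop_output_probability(theta_in, theta_test):
    """Analyzer loop: no measurement, so the amplitudes recombine."""
    return pass_probability(inverse_analyzer(analyzer(resolve(theta_in))), theta_test)


def collapsed_output_probability(theta_in, theta_test):
    """What the separation fallacy would predict: a v/h mixture."""
    psi = resolve(theta_in)
    test = resolve(theta_test)
    return psi["v"] ** 2 * test["v"] ** 2 + psi["h"] ** 2 * test["h"] ** 2


if __name__ == "__main__":
    print(loop_output_probability(45, 45), collapsed_output_probability(45, 45))
    print(loop_output_probability(45, 135), collapsed_output_probability(45, 135))
```

<p>In this sketch, a beam prepared at 45 degrees passes a 45-degree test polarizer with certainty after the loop and never passes one at 135 degrees, whereas the collapsed mixture would pass either test polarizer only half the time. The loop result is the one that agrees with the statement that the loop transmits the incident beam unchanged.</p>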
<h2>The Stern-Gerlach experiment</h2>
<p>We have seen the separation fallacy in the standard treatments of the double-slit experiment, which-way interferometer experiments, and polarization analyzers. In spite of the differences between those separation experiments, there was the common (mis)interpretative theme of premature projection. Since the &#8220;logic&#8221; of the polarization analyzers is followed in the Stern-Gerlach experiment (with spin playing the role of polarization), it is not surprising that the same fallacy occurs there.</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Stern-Gerlach1.jpg"><img class="aligncenter size-medium wp-image-71" title="Stern-Gerlach" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Stern-Gerlach1-300x139.jpg" alt="" width="300" height="139" /></a></p>
<p>And again, the fallacy is revealed by considering the Stern-Gerlach analogue of an analyzer loop. The idea of a Stern-Gerlach loop seems to have been first broached by David Bohm [1951, 22.11] and was later used by Eugene Wigner [1979]. One of the few texts to consider such a Stern-Gerlach analyzer loop is <em>The Feynman Lectures on Physics: Quantum Mechanics (Vol. III)</em> where it is called a &#8220;modified Stern-Gerlach apparatus&#8221; (p. 5-2).</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Stern-GerlachLoop1.jpg"><img class="aligncenter size-medium wp-image-72" title="Stern-GerlachLoop" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Stern-GerlachLoop1-300x102.jpg" alt="" width="300" height="102" /></a></p>
<p>Ordinarily texts represent the Stern-Gerlach apparatus as separating particles into spin eigenstates denoted by, say, <img src='http://s.wordpress.com/latex.php?latex=%2BS%2C0S%2C-S&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='+S,0S,-S' title='+S,0S,-S' class='latex' />. But as in our other examples, the apparatus does <em>not</em> project the particles to eigenstates. Instead it creates a superposition state such that a detector in a certain position, when it causes the collapse to a spin eigenstate, will only see particles of one spin state. Alternatively, if the collapse is caused by placing blocking masks over two of the beams, then the particles in the third beam will all be those that have collapsed to the same eigenstate. It is the detectors or blockers that cause the collapse or projection to eigenstates, not the prior separation apparatus.</p>
<p>We previously saw how a polarization analyzer, contrary to the statement in many texts, does not lose the polarization information of the incident beam when it &#8220;separates&#8221; the beam. In the context of the Stern-Gerlach apparatus, Feynman similarly remarks:</p>
<blockquote><p>&#8220;Some people would say that in the filtering by T we have &#8216;lost the information&#8217; about the previous state (<img src='http://s.wordpress.com/latex.php?latex=%2BS&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='+S' title='+S' class='latex' />) because we have &#8216;disturbed&#8217; the atoms when we separated them into three beams in the apparatus T. But that is not true. The past information is not lost by the <em>separation</em> into three beams, but by the <em>blocking masks</em> that are put in….&#8221; [p. 5-9, italics in original]</p></blockquote>
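<p>Feynman's point can be illustrated with a small numerical sketch in Python with NumPy. It assumes an idealized spin-1 apparatus T whose three beams correspond to an orthonormal basis, taken here to be the columns of an arbitrary unitary matrix (a hypothetical stand-in, not the actual rotation between the S and T orientations), and it assumes the loop recombines the beams without adding relative phases. With all three beams open the loop acts as the identity, so the incoming state survives; with masks over two beams the output is the same T eigenstate whatever the input was.</p>

```python
import numpy as np

# Hypothetical T-apparatus eigenbasis: columns of a unitary obtained from a
# QR factorization of a fixed invertible matrix (any unitary would do here).
seed_matrix = np.array([[1, 2, 0], [0, 1, 1j], [1j, 0, 1]], dtype=complex)
t_basis, _ = np.linalg.qr(seed_matrix)

# Two different incoming spin-1 states, written in the S basis (+S, 0S, -S).
plus_S = np.array([1, 0, 0], dtype=complex)
minus_S = np.array([0, 0, 1], dtype=complex)

# One projector per T beam.
projectors = [np.outer(t_basis[:, k], t_basis[:, k].conj()) for k in range(3)]

# Open loop: all three beams are recombined, so the net operation is the
# sum of the three projectors, which is the identity.
open_loop = sum(projectors)
state_open = open_loop @ plus_S

# Masks over beams 1 and 2: only beam 0 survives (projection, then renormalize).
def masked_output(state):
    projected = projectors[0] @ state
    return projected / np.linalg.norm(projected)

out_from_plus = masked_output(plus_S)
out_from_minus = masked_output(minus_S)

print("open loop returns the input state:", np.allclose(state_open, plus_S))
print("overlap of the two masked outputs:", abs(np.vdot(out_from_plus, out_from_minus)))
```

<p>In this sketch the separation alone loses nothing, while the masked outputs from the two different inputs coincide up to a phase, which is where the information about the previous state is lost.</p>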
<h2>The Separation Fallacy</h2>
<p>We have seen the same fallacy of interpretation in two-slit experiments, which-way interferometer experiments, polarization analyzers, and Stern-Gerlach experiments. The common element in all the cases is that there is some &#8216;separation&#8217; apparatus that puts a particle into a certain superposition of eigenstates in such a manner that when an <em>appropriately positioned</em> detector induces a collapse to an eigenstate, then the detector will only register one of the eigenstates. The separation fallacy is that this is misinterpreted as showing that the particle was already in that eigenstate in that position as a result of the previous &#8216;separation.&#8217; The quantum erasers are elaborated versions of these simpler experiments, and a similar separation fallacy arises in that context.</p>
<h2>A simple quantum eraser experiment</h2>
<p style="text-align: left;">A simple quantum eraser can be devised using a single beam of photons as in Hilmer and Kwiat [2007]. We start with the standard two-slit setup.<br />
<a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Two-slit-interf.jpg"><img class="aligncenter size-medium wp-image-74" title="Two-slit-interf" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Two-slit-interf-300x210.jpg" alt="" width="300" height="210" /></a><br />
After the two slits, a photon could be schematically represented as being in a superposition state <img src='http://s.wordpress.com/latex.php?latex=%7Cs1%5Crangle%2B%7Cs2%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s1\rangle+|s2\rangle' title='|s1\rangle+|s2\rangle' class='latex' /> (where <img src='http://s.wordpress.com/latex.php?latex=s1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='s1' title='s1' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=s2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='s2' title='s2' class='latex' /> stand for the two slits) which evolves with interference to give the familiar pattern on the far wall.</p>
<p style="text-align: left;">A <img src='http://s.wordpress.com/latex.php?latex=-45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='-45^{\circ}' title='-45^{\circ}' class='latex' /> polarizer might be placed in front of the two slits to control the incoming polarization, so that the photons split half-and-half when a horizontal polarizer is placed behind slit 1 and a vertical polarizer behind slit 2.<br />
<a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Two-slit-vh-polarizers1.jpg"><img class="aligncenter size-medium wp-image-78" title="Two-slit-vh-polarizers" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Two-slit-vh-polarizers1-300x210.jpg" alt="" width="300" height="210" /></a><br />
After the two slits and the polarizers, a photon is in a state that entangles the spatial slit states and the polarization states which might be represented as: <img src='http://s.wordpress.com/latex.php?latex=%7Cs1%5Crangle%20%5Cotimes%20%7Ch%5Crangle%20%2B%7Cs2%5Crangle%20%5Cotimes%20%7Cv%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s1\rangle \otimes |h\rangle +|s2\rangle \otimes |v\rangle' title='|s1\rangle \otimes |h\rangle +|s2\rangle \otimes |v\rangle' class='latex' />. But as this superposition evolves, it cannot be separated into a superposition of the slit-states as before, so the interference disappears. The so-called &#8220;which slit&#8221; information is said to be marked with the polarization information.</p>
<p style="text-align: left;">Then a <img src='http://s.wordpress.com/latex.php?latex=%2B45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='+45^{\circ}' title='+45^{\circ}' class='latex' /> polarizer is inserted between the two-slit screen and the wall. This transforms the evolving state to:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%7Cs1%5Crangle%5Cotimes%7C45%5E%7B%5Ccirc%7D%5Crangle%2B%7Cs2%5Crangle%5Cotimes%7C45%5E%7B%5Ccirc%7D%5Crangle%20%3D%20%5B%7Cs1%5Crangle%2B%7Cs2%5Crangle%5D%5Cotimes%20%7C45%5E%7B%5Ccirc%7D%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s1\rangle\otimes|45^{\circ}\rangle+|s2\rangle\otimes|45^{\circ}\rangle = [|s1\rangle+|s2\rangle]\otimes |45^{\circ}\rangle' title='|s1\rangle\otimes|45^{\circ}\rangle+|s2\rangle\otimes|45^{\circ}\rangle = [|s1\rangle+|s2\rangle]\otimes |45^{\circ}\rangle' class='latex' /></p>
<p style="text-align: left;">so that the <img src='http://s.wordpress.com/latex.php?latex=%7Cs1%5Crangle%2B%7Cs2%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s1\rangle+|s2\rangle' title='|s1\rangle+|s2\rangle' class='latex' /> term will show interference in a &#8220;fringe&#8221; pattern when the <img src='http://s.wordpress.com/latex.php?latex=%2B45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='+45^{\circ}' title='+45^{\circ}' class='latex' /> polarized photons hit the wall. If we had inserted a <img src='http://s.wordpress.com/latex.php?latex=-45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='-45^{\circ}' title='-45^{\circ}' class='latex' /> polarizer, then again interference in an &#8220;antifringe&#8221; pattern would appear as the <img src='http://s.wordpress.com/latex.php?latex=-45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='-45^{\circ}' title='-45^{\circ}' class='latex' /> polarized photons hit the wall. The sum of the fringe and antifringe patterns gives the no-interference pattern of the previous figure.</p>
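<p>The claim that the fringe and antifringe patterns add up to the no-interference pattern can be checked with a short sketch in Python with NumPy. It assumes a simplified far-field model in which each point on the wall is labelled only by the relative phase <code>phi</code> between the two slit paths (the single-slit envelope is ignored), and it uses the schematic tagged state written above, normalized. The variable names are illustrative.</p>

```python
import numpy as np

# Polarization basis vectors (horizontal, vertical) and the two diagonal states.
h = np.array([1, 0], dtype=complex)
v = np.array([0, 1], dtype=complex)
plus45 = (h + v) / np.sqrt(2)
minus45 = (h - v) / np.sqrt(2)

# Relative phase between the slit-1 and slit-2 paths at sample points on the wall.
phi = np.linspace(-2 * np.pi, 2 * np.pi, 9)

# Tagged state at each wall point: slit 1 carries h, slit 2 carries v with phase phi.
amp = (h[None, :] + np.exp(1j * phi)[:, None] * v[None, :]) / np.sqrt(2)

# No polarizer before the wall: add the intensities of the two polarizations.
no_polarizer = np.sum(np.abs(amp) ** 2, axis=1)

# A +45 or -45 degree polarizer before the wall: project, then take the intensity.
fringe = np.abs(amp @ plus45.conj()) ** 2
antifringe = np.abs(amp @ minus45.conj()) ** 2

print("no polarizer:", np.round(no_polarizer, 3))
print("fringe:      ", np.round(fringe, 3))
print("antifringe:  ", np.round(antifringe, 3))
print("sum matches: ", np.allclose(fringe + antifringe, no_polarizer))
```

<p>In this model the fringe intensity is (1 + cos phi)/2 and the antifringe intensity is (1 - cos phi)/2, so each polarizer passes half of the photons and the two patterns sum to the flat one.</p>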
<p style="text-align: left;"><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Two-slit-45-polarizer1.jpg"><img class="aligncenter size-medium wp-image-79" title="Two-slit-45-polarizer" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/Two-slit-45-polarizer1-300x210.jpg" alt="" width="300" height="210" /></a><br />
A common description of this type of quantum eraser experiment is that the insertion of the h,v polarizers &#8220;marks&#8221; the photons with &#8220;which-slit information&#8221; (previous figure without the <img src='http://s.wordpress.com/latex.php?latex=%2B45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='+45^{\circ}' title='+45^{\circ}' class='latex' /> polarizer) that destroys the interference&#8211;even if the horizontal or vertical polarization is not measured at the wall. If the horizontal or vertical polarization were measured at the wall, then the evolved superposition state <img src='http://s.wordpress.com/latex.php?latex=%7Cs1%5Crangle%20%5Cotimes%20%7Ch%5Crangle%20%2B%7Cs2%5Crangle%20%5Cotimes%20%7Cv%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s1\rangle \otimes |h\rangle +|s2\rangle \otimes |v\rangle' title='|s1\rangle \otimes |h\rangle +|s2\rangle \otimes |v\rangle' class='latex' /> would collapse to the evolved version of <img src='http://s.wordpress.com/latex.php?latex=%7Cs1%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s1\rangle' title='|s1\rangle' class='latex' /> (if h was found) or <img src='http://s.wordpress.com/latex.php?latex=%7Cs2%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s2\rangle' title='|s2\rangle' class='latex' /> (if v was found). This is said to reveal the so-called &#8220;which-slit information&#8221; that the photon went through slit 1 or slit 2, i.e., that at the slits, the photon was already in the state <img src='http://s.wordpress.com/latex.php?latex=%7Cs1%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s1\rangle' title='|s1\rangle' class='latex' /> or <img src='http://s.wordpress.com/latex.php?latex=%7Cs2%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s2\rangle' title='|s2\rangle' class='latex' /> instead of being in the superposition state.
By incorrectly inferring that the photon was in one state or the other at the slits&#8211;while it would have to &#8220;go through both slits&#8221; to yield the interference pattern obtained by inserting the <img src='http://s.wordpress.com/latex.php?latex=%2B45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='+45^{\circ}' title='+45^{\circ}' class='latex' /> polarizer&#8211;we seem to be able to retro-cause the particle to go through one slit or both slits by withdrawing or inserting the <img src='http://s.wordpress.com/latex.php?latex=%2B45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='+45^{\circ}' title='+45^{\circ}' class='latex' /> polarizer after a photon has traversed the two slits.</p>
<p style="text-align: left;">It is precisely the separation fallacy that leads to this inference of retrocausality. In the situation of the previous figure (before inserting the <img src='http://s.wordpress.com/latex.php?latex=%2B45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='+45^{\circ}' title='+45^{\circ}' class='latex' /> polarizer), a photon stays in a superposition state <img src='http://s.wordpress.com/latex.php?latex=%7Cs1%5Crangle%20%5Cotimes%20%7Ch%5Crangle%20%2B%7Cs2%5Crangle%20%5Cotimes%20%7Cv%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s1\rangle \otimes |h\rangle +|s2\rangle \otimes |v\rangle' title='|s1\rangle \otimes |h\rangle +|s2\rangle \otimes |v\rangle' class='latex' /> until it hits the wall. The slit states are indeed marked, tagged, labelled, or entangled with polarization states, but this is incorrectly called &#8220;which-slit information&#8221; as if it could &#8220;reveal&#8221; that the photon was in the state <img src='http://s.wordpress.com/latex.php?latex=%7Cs1%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s1\rangle' title='|s1\rangle' class='latex' /> or <img src='http://s.wordpress.com/latex.php?latex=%7Cs2%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|s2\rangle' title='|s2\rangle' class='latex' /> at the slits, i.e., that it went through slit 1 or slit 2.</p>
<p style="text-align: left;">Also it might be noted that the insertion of a <img src='http://s.wordpress.com/latex.php?latex=%2B45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='+45^{\circ}' title='+45^{\circ}' class='latex' /> or <img src='http://s.wordpress.com/latex.php?latex=-45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='-45^{\circ}' title='-45^{\circ}' class='latex' /> polarizer does not &#8220;restore&#8221; the original interference pattern but instead picks the fringe or antifringe interference pattern out of the previous &#8220;mush&#8221; of hits.</p>
<p><object width="400" height="400" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="src" value="http://chesters.org/marvin/SWFs/QMeraser.swf" /><param name="quality" value="high" /><param name="allowscriptaccess" value="sameDomain" /><param name="pluginspage" value="http://www.macromedia.com/go/getflashplayer" /><embed width="400" height="400" type="application/x-shockwave-flash" src="http://chesters.org/marvin/SWFs/QMeraser.swf" quality="high" allowscriptaccess="sameDomain" pluginspage="http://www.macromedia.com/go/getflashplayer" /></object><br />
<a href="http://www.chesters.org/marvin/index.html">Marvin Chester</a> has developed a user-drivable model of this quantum eraser experiment to illustrate how it works.</p>
<h2>Two photon quantum eraser experiment</h2>
<p>We now turn to one of the more elaborate <a href="http://en.wikipedia.org/wiki/Quantum_eraser_experiment">quantum eraser experiments</a> (the treatment in the Wikipedia link is muddled but it gives the <a href="http://grad.physics.sunysb.edu/%7Eamarch/Walborn.pdf">link</a> to the original paper by Walborn et al. [2002] for this experiment). As noted in the Walborn paper, their quantum eraser experiment is an optical realization of a type of quantum eraser suggested by Scully et al. [1991] using maser cavities and atoms.</p>
<p>A photon hits a down-converter which emits a &#8220;signal&#8221; p-photon entangled with an &#8220;idler&#8221; s-photon with a superposition of orthogonal <img src='http://s.wordpress.com/latex.php?latex=%7Cx%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|x\rangle' title='|x\rangle' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7Cy%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|y\rangle' title='|y\rangle' class='latex' /> polarizations:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%7C%5CPsi%5Crangle%20%3D%20%5Cfrac%7B1%7D%7B%5Csqrt%7B2%7D%7D%20%5B%7Cx%5Crangle_%7Bs%7D%5Cotimes%20%7Cy%5Crangle_%7Bp%7D%2B%7Cy%5Crangle_%7Bs%7D%5Cotimes%7Cx%5Crangle_%7Bp%7D%5D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|\Psi\rangle = \frac{1}{\sqrt{2}} [|x\rangle_{s}\otimes |y\rangle_{p}+|y\rangle_{s}\otimes|x\rangle_{p}]' title='|\Psi\rangle = \frac{1}{\sqrt{2}} [|x\rangle_{s}\otimes |y\rangle_{p}+|y\rangle_{s}\otimes|x\rangle_{p}]' class='latex' />.</p>
<p>The lower s-photon hits a double-slit screen, and will show an interference pattern on the <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' /> detector as the detector is moved along the x-axis.</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/QEraser1.jpg"><img class="aligncenter size-medium wp-image-62" title="QEraser1" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/QEraser1-286x300.jpg" alt="" width="286" height="300" /></a></p>
<p>Next two quarter-wave plates are inserted before the 2-slit screen with the fast axis of the one over slit 1 oriented at <img src='http://s.wordpress.com/latex.php?latex=%2B45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='+45^{\circ}' title='+45^{\circ}' class='latex' /> to the x-axis and the one over the slit 2 with its fast axis oriented at <img src='http://s.wordpress.com/latex.php?latex=-45%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='-45^{\circ}' title='-45^{\circ}' class='latex' /> to the x-axis.</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/QEraser-plate.jpg"><img class="aligncenter size-medium wp-image-63" title="QEraser-plate" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/QEraser-plate-286x300.jpg" alt="" width="286" height="300" /></a></p>
<p>Then Walborn et al. [2002] give the overall state of the system as (where the s1 and s2 tags refer to the two slits):</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%7C%5CPsi%5Crangle%20%3D%20%5Cfrac%7B1%7D%7B2%7D%5B%28%7CL%5Crangle_%7Bs1%7D%5Cotimes%7Cy%5Crangle_%7Bp%7D%2B%20i%7CR%5Crangle_%7Bs1%7D%5Cotimes%7Cx%5Crangle_%7Bp%7D%29%2B%20%28%7CR%5Crangle_%7Bs2%7D%5Cotimes%7Cy%5Crangle_%7Bp%7D-i%7CL%5Crangle_%7Bs2%7D%5Cotimes%7Cx%5Crangle_%7Bp%7D%29%5D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|\Psi\rangle = \frac{1}{2}[(|L\rangle_{s1}\otimes|y\rangle_{p}+ i|R\rangle_{s1}\otimes|x\rangle_{p})+ (|R\rangle_{s2}\otimes|y\rangle_{p}-i|L\rangle_{s2}\otimes|x\rangle_{p})]' title='|\Psi\rangle = \frac{1}{2}[(|L\rangle_{s1}\otimes|y\rangle_{p}+ i|R\rangle_{s1}\otimes|x\rangle_{p})+ (|R\rangle_{s2}\otimes|y\rangle_{p}-i|L\rangle_{s2}\otimes|x\rangle_{p})]' class='latex' />.</p>
<p>Then by measuring the linear polarization of the p-photon at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{p}' title='D_{p}' class='latex' /> and the circular polarization at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' />, &#8220;which-slit information&#8221; is said to be obtained and no interference pattern recorded at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' />.</p>
<p>For instance, measuring <img src='http://s.wordpress.com/latex.php?latex=%7Cx%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|x\rangle' title='|x\rangle' class='latex' /> at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{p}' title='D_{p}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7CL%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|L\rangle' title='|L\rangle' class='latex' /> at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' /> implies s2, i.e., slit 2. But as previously explained, this does <em>not</em> mean that the s-photon went through slit 2. It means we have positioned the two detectors <em>in polarization space</em>, say to measure <img src='http://s.wordpress.com/latex.php?latex=%7Cx%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|x\rangle' title='|x\rangle' class='latex' /> polarization at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{p}' title='D_{p}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%7CL%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|L\rangle' title='|L\rangle' class='latex' /> polarization at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' />, so only when the superposition state collapses to <img src='http://s.wordpress.com/latex.php?latex=%7Cx%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|x\rangle' title='|x\rangle' class='latex' /> for the p-photon and <img src='http://s.wordpress.com/latex.php?latex=%7CL%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|L\rangle' title='|L\rangle' class='latex' /> for the s-photon do we get a hit at both detectors.</p>
<p>This is the analogue of the one-beam-splitter interferometer where the positioning of the detectors would only record one collapsed state which did not imply the system was all along in that particular arm-eigenstate. The phrase &#8220;which-slit&#8221; or &#8220;which-arm information&#8221; is a misnomer in that it implies the system was already in a slit- or arm-eigenstate and the so-called measurement only revealed the information. Instead, it is <em>only</em> at the measurement that there is a collapse or projection to an evolved slit-eigenstate (not at the previous &#8216;separation&#8217; due to the two slits).</p>
<p>Walborn et al. indulge in the separation fallacy when they discuss what the so-called  &#8220;which-path information&#8221; reveals.</p>
<blockquote><p>Let us consider the first possibility [detecting p before s]. If photon p is detected with polarization x (say), then we know that photon s has polarization y before hitting the <img src='http://s.wordpress.com/latex.php?latex=%5Clambda%2F4&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\lambda/4' title='\lambda/4' class='latex' /> plates and the double slit. By looking at [the above formula for <img src='http://s.wordpress.com/latex.php?latex=%7C%5CPsi%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|\Psi\rangle' title='|\Psi\rangle' class='latex' />], it is clear that detection of photon s (after the double slit) with polarization R is compatible only with the passage of s through slit 1 and polarization L is compatible only with the passage of s through slit 2. This can be verified experimentally. In the usual quantum mechanics language, detection of photon p before photon s has prepared photon s in a certain state. [Walborn et al. 2002, p. 4]</p></blockquote>
<p>Firstly, the measurement that p has polarization x after the s photon has traversed the <img src='http://s.wordpress.com/latex.php?latex=%5Clambda%2F4&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\lambda/4' title='\lambda/4' class='latex' /> plates and two slits [see their Figure 1] does not retrocause the s photon to already have &#8220;polarization y before hitting the <img src='http://s.wordpress.com/latex.php?latex=%5Clambda%2F4&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\lambda/4' title='\lambda/4' class='latex' /> plates.&#8221; After photon p is measured with polarization x, then the two particle system is in the superposition state:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=i%7CR%5Crangle_%7Bs1%7D%5Cotimes%7Cx%5Crangle_%7Bp%7D-i%7CL%5Crangle_%7Bs2%7D%5Cotimes%7Cx%5Crangle_%7Bp%7D%3D%5Bi%7CR%5Crangle_%7Bs1%7D-i%7CL%5Crangle_%7Bs2%7D%5D%5Cotimes%7Cx%5Crangle_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='i|R\rangle_{s1}\otimes|x\rangle_{p}-i|L\rangle_{s2}\otimes|x\rangle_{p}=[i|R\rangle_{s1}-i|L\rangle_{s2}]\otimes|x\rangle_{p}' title='i|R\rangle_{s1}\otimes|x\rangle_{p}-i|L\rangle_{s2}\otimes|x\rangle_{p}=[i|R\rangle_{s1}-i|L\rangle_{s2}]\otimes|x\rangle_{p}' class='latex' /></p>
<p>which means that the s photon is still in the slit-superposition state: <img src='http://s.wordpress.com/latex.php?latex=i%7CR%5Crangle_%7Bs1%7D-i%7CL%5Crangle_%7Bs2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='i|R\rangle_{s1}-i|L\rangle_{s2}' title='i|R\rangle_{s1}-i|L\rangle_{s2}' class='latex' />. Then only with the measurement of the circular polarization states L or R at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' /> do we have the collapse to (the evolved version of) one of the slit eigenstates s1 or s2 (in their notation). It is an instance of the separation fallacy to infer &#8220;the passage of s through slit 1&#8221; or &#8220;slit 2&#8221;, i.e., s1 or s2, instead of the photon s being in the tagged superposition state <img src='http://s.wordpress.com/latex.php?latex=%7C%5CPsi%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|\Psi\rangle' title='|\Psi\rangle' class='latex' /> after traversing the slits.</p>
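<p>The two projection steps described here can be followed numerically. The sketch below, in Python with NumPy, builds the overall state given by Walborn et al. as an 8-component vector (slit, s polarization, p polarization), projects the p-photon onto x, and then projects the s-photon polarization onto L or R. The circular states are taken as L = (x + iy)/sqrt(2) and R = (x - iy)/sqrt(2); that convention is an assumption of the sketch, although the result only relies on L and R being orthogonal.</p>

```python
import numpy as np

# Polarization states (the L, R convention is an assumption of this sketch).
x = np.array([1, 0], dtype=complex)
y = np.array([0, 1], dtype=complex)
L = (x + 1j * y) / np.sqrt(2)
R = (x - 1j * y) / np.sqrt(2)

# Slit labels for the s-photon.
s1 = np.array([1, 0], dtype=complex)
s2 = np.array([0, 1], dtype=complex)

def ket(slit, pol_s, pol_p):
    """Product state ordered as (slit, s polarization, p polarization)."""
    return np.kron(np.kron(slit, pol_s), pol_p)

# Overall state after the quarter-wave plates, as quoted from Walborn et al.
psi = 0.5 * (ket(s1, L, y) + 1j * ket(s1, R, x)
             + ket(s2, R, y) - 1j * ket(s2, L, x))

# Step 1: p-photon found with polarization x. Contract the p index with x.
s_after_x = psi.reshape(4, 2) @ x.conj()
still_superposed = 0.5 * (1j * np.kron(s1, R) - 1j * np.kron(s2, L))
print("s-photon still in the slit superposition:",
      np.allclose(s_after_x, still_superposed))

# Step 2: circular polarization measured at D_s. Contract the polarization index.
slit_amps_if_L = s_after_x.reshape(2, 2) @ L.conj()
slit_amps_if_R = s_after_x.reshape(2, 2) @ R.conj()
print("slit amplitudes (s1, s2) if L is found:", slit_amps_if_L)
print("slit amplitudes (s1, s2) if R is found:", slit_amps_if_R)
```

<p>After the first step both slit terms are still present; a single slit term is left only after the second projection, that is, at the measurement at the detector.</p>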
<p>Let us take a new polarization space basis of <img src='http://s.wordpress.com/latex.php?latex=%7C%2B%5Crangle%20%3D%20%2B45%20%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|+\rangle = +45 ^{\circ}' title='|+\rangle = +45 ^{\circ}' class='latex' /> to the x-axis and <img src='http://s.wordpress.com/latex.php?latex=%7C-%5Crangle%20%3D%20-45%20%5E%7B%5Ccirc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|-\rangle = -45 ^{\circ}' title='|-\rangle = -45 ^{\circ}' class='latex' /> to the x-axis. Then the overall state can be rewritten in terms of this basis as (see original paper for the details):</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%7C%5CPsi%5Crangle%20%3D%20%5Cfrac%7B1%7D%7B2%7D%5B%28%7C%2B%5Crangle_%7Bs1%7D-i%7C%2B%5Crangle_%7Bs2%7D%29%5Cotimes%20%7C%2B%5Crangle_%7Bp%7D%2B%20i%28%7C-%5Crangle_%7Bs1%7D%2Bi%7C-%5Crangle_%7Bs2%7D%29%5Cotimes%20%7C-%5Crangle_%7Bp%7D%5D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|\Psi\rangle = \frac{1}{2}[(|+\rangle_{s1}-i|+\rangle_{s2})\otimes |+\rangle_{p}+ i(|-\rangle_{s1}+i|-\rangle_{s2})\otimes |-\rangle_{p}]' title='|\Psi\rangle = \frac{1}{2}[(|+\rangle_{s1}-i|+\rangle_{s2})\otimes |+\rangle_{p}+ i(|-\rangle_{s1}+i|-\rangle_{s2})\otimes |-\rangle_{p}]' class='latex' />.</p>
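<p>For readers who want to check this change of basis without working through the paper, here is a minimal numerical sketch in Python with NumPy. It assumes the conventions L = (x + iy)/sqrt(2), R = (x - iy)/sqrt(2), and plus or minus = (x plus or minus y)/sqrt(2), which are not spelled out in this post. Under these assumptions the circular-basis and diagonal-basis expressions describe the same state up to an overall phase factor, which has no physical effect.</p>

```python
import numpy as np

# Polarization states (conventions assumed for this sketch).
x = np.array([1, 0], dtype=complex)
y = np.array([0, 1], dtype=complex)
L = (x + 1j * y) / np.sqrt(2)
R = (x - 1j * y) / np.sqrt(2)
plus = (x + y) / np.sqrt(2)
minus = (x - y) / np.sqrt(2)

# Slit labels for the s-photon.
s1 = np.array([1, 0], dtype=complex)
s2 = np.array([0, 1], dtype=complex)

def ket(slit, pol_s, pol_p):
    """Product state ordered as (slit, s polarization, p polarization)."""
    return np.kron(np.kron(slit, pol_s), pol_p)

# The state written with circular (s) and linear x, y (p) polarizations.
psi_circular = 0.5 * (ket(s1, L, y) + 1j * ket(s1, R, x)
                      + ket(s2, R, y) - 1j * ket(s2, L, x))

# The same state written in the +45/-45 degree basis.
psi_diagonal = 0.5 * ((ket(s1, plus, plus) - 1j * ket(s2, plus, plus))
                      + 1j * (ket(s1, minus, minus) + 1j * ket(s2, minus, minus)))

# Inner product of the two expressions; modulus 1 means they are the same
# state up to a global phase.
overlap = np.vdot(psi_diagonal, psi_circular)
print("modulus of overlap:", abs(overlap))
print("phase of overlap in units of pi:", np.angle(overlap) / np.pi)
```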
<p> <a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/QEraser-polarizer.jpg"><img class="aligncenter size-medium wp-image-64" title="QEraser-polarizer" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/QEraser-polarizer-286x300.jpg" alt="" width="286" height="300" /></a></p>
<p>Then a <img src='http://s.wordpress.com/latex.php?latex=%7C%2B%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|+\rangle' title='|+\rangle' class='latex' /> polarizer or a <img src='http://s.wordpress.com/latex.php?latex=%7C-%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|-\rangle' title='|-\rangle' class='latex' /> polarizer is inserted in front of <img src='http://s.wordpress.com/latex.php?latex=D_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{p}' title='D_{p}' class='latex' /> to select <img src='http://s.wordpress.com/latex.php?latex=%7C%2B%5Crangle_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|+\rangle_{p}' title='|+\rangle_{p}' class='latex' /> or <img src='http://s.wordpress.com/latex.php?latex=%7C-%5Crangle_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|-\rangle_{p}' title='|-\rangle_{p}' class='latex' /> respectively. In the first case, this reduces the overall state <img src='http://s.wordpress.com/latex.php?latex=%7C%5CPsi%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|\Psi\rangle' title='|\Psi\rangle' class='latex' /> to <img src='http://s.wordpress.com/latex.php?latex=%28%7C%2B%5Crangle_%7Bs1%7D-i%7C%2B%5Crangle_%7Bs2%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(|+\rangle_{s1}-i|+\rangle_{s2})' title='(|+\rangle_{s1}-i|+\rangle_{s2})' class='latex' />  which exhibits an interference pattern, and similarly for the <img src='http://s.wordpress.com/latex.php?latex=%7C-%5Crangle_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|-\rangle_{p}' title='|-\rangle_{p}' class='latex' /> selection. This is misleadingly said to &#8220;erase&#8221; the so-called &#8220;which-slit information&#8221; so the interference pattern is restored.</p>
<p>The first thing to notice is that two complementary interference patterns, called &#8220;fringes&#8221; and &#8220;antifringes,&#8221; are being selected. Their sum is the no-interference pattern obtained before inserting the polarizer. The polarizer simply selects one of the interference patterns out of the mush of their merged non-interference pattern. Thus <em>instead</em> of &#8220;erasing which-slit information,&#8221; it selects one of two interference patterns out of the both-patterns mush.</p>
<p>Even though the polarizer may be inserted after the s-photon has traversed the 2 slits, there is no retrocausation of the photon going through both slits or only one slit, as previously explained.</p>
<p>One might also notice that the entangled p-photon plays little real role in this non-delayed setup (except to increase the woo-woo factor). Instead of inserting the <img src='http://s.wordpress.com/latex.php?latex=%7C%2B%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|+\rangle' title='|+\rangle' class='latex' /> or <img src='http://s.wordpress.com/latex.php?latex=%7C-%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|-\rangle' title='|-\rangle' class='latex' /> polarizer in front of <img src='http://s.wordpress.com/latex.php?latex=D_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{p}' title='D_{p}' class='latex' />, insert it in front of <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' /> and it would have the same effect of selecting <img src='http://s.wordpress.com/latex.php?latex=%28%7C%2B%5Crangle_%7Bs1%7D-i%7C%2B%5Crangle_%7Bs2%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(|+\rangle_{s1}-i|+\rangle_{s2})' title='(|+\rangle_{s1}-i|+\rangle_{s2})' class='latex' /> or <img src='http://s.wordpress.com/latex.php?latex=%28%7C-%5Crangle_%7Bs1%7D%2Bi%7C-%5Crangle_%7Bs2%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(|-\rangle_{s1}+i|-\rangle_{s2})' title='(|-\rangle_{s1}+i|-\rangle_{s2})' class='latex' />  each of which exhibits interference. Then it is much like the simple quantum eraser of the previous section.</p>
<h2>The delayed quantum eraser</h2>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/QEraser-delayed.jpg"><img class="aligncenter size-medium wp-image-65" title="QEraser-delayed" src="http://www.mathblog.ellerman.org/wp-content/uploads/2011/11/QEraser-delayed-264x300.jpg" alt="" width="264" height="300" /></a></p>
<p>If the upper arm is extended so the <img src='http://s.wordpress.com/latex.php?latex=D_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{p}' title='D_{p}' class='latex' /> detector is triggered last (&#8220;delayed erasure&#8221;), the same results are obtained. The entangled state is collapsed at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' />. A coincidence counter (not pictured) is used to correlate the hits at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' /> with the hits at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{p}' title='D_{p}' class='latex' /> for each fixed polarizer setting, and the same interference pattern is obtained.</p>
<p>The interesting point is that the <img src='http://s.wordpress.com/latex.php?latex=D_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{p}' title='D_{p}' class='latex' /> detections could be years after the <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' /> hits in this delayed erasure setup. If the <img src='http://s.wordpress.com/latex.php?latex=D_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{p}' title='D_{p}' class='latex' /> polarizer is set at <img src='http://s.wordpress.com/latex.php?latex=%7C%2B%5Crangle_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|+\rangle_{p}' title='|+\rangle_{p}' class='latex' />, then out of the mush of hits at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' /> obtained years before, the coincidence counter will pick out the ones from <img src='http://s.wordpress.com/latex.php?latex=%7C%2B%5Crangle_%7Bs1%7D-i%7C%2B%5Crangle_%7Bs2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|+\rangle_{s1}-i|+\rangle_{s2}' title='|+\rangle_{s1}-i|+\rangle_{s2}' class='latex' /> which will show interference.</p>
<p>Again, the years-later <img src='http://s.wordpress.com/latex.php?latex=D_%7Bp%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{p}' title='D_{p}' class='latex' /> detections do not retro-cause anything at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' />, e.g., do not &#8220;erase which-way information&#8221; years after the <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' /> hits are recorded (in spite of the &#8220;delayed erasure&#8221; talk). They only pick (via the coincidence counter) one or the other interference pattern out of the years-earlier mush of hits at <img src='http://s.wordpress.com/latex.php?latex=D_%7Bs%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='D_{s}' title='D_{s}' class='latex' />.</p>
<blockquote><p>&#8220;We must conclude, therefore, that the loss of distinguishability is but a side effect, and that the essential feature of quantum erasure is the post-selection of subensembles with maximal fringe visibility.&#8221; [Kwiat et al. 1999, p. 79]</p></blockquote>
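<p>The post-selection reading can be made concrete with a toy Monte Carlo sketch in Python with NumPy. It assumes a simplified model in which each hit at the s-photon detector is labelled by the relative slit phase <code>phi</code>, with a flat distribution of hits, and in which the partner photon is later found in the plus state with probability (1 + sin phi)/2, as follows from the plus/minus form of the state above under the same simplifications. The sample size, bin count, and function names are illustrative choices.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n_pairs = 200_000

# Recorded first: the s-photon hits, labelled by the relative slit phase phi.
# On their own they form a flat pattern (the mush).
phi_hits = rng.uniform(0.0, 2.0 * np.pi, n_pairs)

# Recorded later, possibly much later: the outcome at the p-photon detector
# for the partner of each hit. Assumed model: P(plus | phi) = (1 + sin(phi)) / 2.
p_plus = 0.5 * (1.0 + np.sin(phi_hits))
tagged_plus = rng.binomial(1, p_plus).astype(bool)

def histogram(phases):
    """Counts of hits in 8 equal bins of the phase."""
    counts, _ = np.histogram(phases, bins=8, range=(0.0, 2.0 * np.pi))
    return counts

def visibility(counts):
    """Fringe visibility (max - min) / (max + min) of a histogram."""
    return (counts.max() - counts.min()) / (counts.max() + counts.min())

# The coincidence counter only sorts the already recorded hits by their tag.
all_hits = histogram(phi_hits)
plus_hits = histogram(phi_hits[tagged_plus])
minus_hits = histogram(phi_hits[~tagged_plus])

print("all hits:  ", all_hits, "visibility", round(visibility(all_hits), 3))
print("plus only: ", plus_hits, "visibility", round(visibility(plus_hits), 3))
print("minus only:", minus_hits, "visibility", round(visibility(minus_hits), 3))
```

<p>Nothing in the list of recorded hits is changed by the later tags; the sorting only reveals two complementary sub-patterns whose sum is the original flat histogram.</p>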
<p>The same sort of analysis could be made of the <a href="http://en.wikipedia.org/wiki/Delayed_choice_quantum_eraser">delayed choice quantum eraser</a> experiment described in the <a href="http://arxiv.org/abs/quant-ph/9903047">paper</a> by Kim et al. &amp; <a href="http://en.wikipedia.org/wiki/Marlan_O._Scully">Scully</a> [Kim 2000].  A good analysis of this experiment which avoids the separation fallacy (and avoids any implication of retro-causality) is given by <a href="http://en.wikipedia.org/wiki/Brian_Greene">Brian Greene</a> of PBS fame [Greene 2004, pp. 194-199].</p>
<h2>References</h2>
<p>Bohm, David 1951.<em> Quantum Theory</em>. Englewood Cliffs NJ: Prentice-Hall.</p>
<p>Dicke, Robert H. and James P. Wittke 1960. <em>Introduction to Quantum Mechanics</em>. Reading MA: Addison-Wesley.</p>
<p>Feynman, Richard P., Robert B. Leighton and Matthew Sands 1965. <em>The Feynman Lectures on Physics: Quantum Mechanics (Vol. III)</em>. Reading MA: Addison-Wesley.</p>
<p>French, A.P. and Edwin F. Taylor 1978. <em>An Introduction to Quantum Physics</em>. New York: Norton.</p>
<p>Greene, Brian 2004.<em> The Fabric of the Cosmos</em>. New York: Alfred A. Knopf.</p>
<p>Hilmer, Rachel and Paul G. Kwiat 2007. A Do-It-Yourself Quantum Eraser. <em>Scientific American</em>. 296 (5 May): 90-95.</p>
<p>Kim, Yoon-Ho, R. Yu, S. P. Kulik, Y. H. Shih and Marlan O. Scully 2000. Delayed choice quantum eraser. <em>Physical Review Letters</em>. 84 (1) [quant-ph/9903047].</p>
<p>Kwiat, P. G., P. D. D. Schwindt and B.-G. Englert 1999. What Does a Quantum Eraser Really Erase? In <em>Mysteries, Puzzles, and Paradoxes in Quantum Mechanics</em>. Rodolfo Bonifacio ed., Woodbury NY: American Institute of Physics: 69-80.</p>
<p>Scully, Marlan O., Berthold-Georg Englert and Herbert Walther 1991. Quantum optical tests of complementarity.<em> Nature</em>. 351 (May 9, 1991): 111-116.</p>
<p>Walborn, S. P., M. O. Terra Cunha, S. Padua and C. H. Monken 2002. Double-slit quantum eraser. <em>Physical Review A.</em> 65 (3).</p>
<p>Wheeler, John A. 1978. The &#8220;Past&#8221; and the &#8220;Delayed-Choice&#8221; Double-Slit Experiment. In <em>Mathematical Foundations of Quantum Theory</em>. A. R. Marlow ed., New York: Academic Press: 9-48.</p>
<p>Wigner, Eugene P. 1979. The Problem of Measurement. In <em>Symmetries and Reflections</em>, Woodbridge CT: Ox Bow Press: 153-170.</p>
<p>Zeilinger, Anton 2004. Why the Quantum? &#8220;It&#8221; from &#8220;bit&#8221;? A participatory universe? Three far-reaching challenges from John Archibald Wheeler and their relation to experiment. In <em>Science and Ultimate Reality: Quantum Theory, Cosmology, and Complexity</em>. John Barrow, Paul Davies and Charles Harper eds., Cambridge: Cambridge University Press: 201-220. Available at: <a href="http://www.quantum.at/zeilinger">http://www.quantum.at/zeilinger</a></p>
<p>Zeilinger, Anton 2008. On the Interpretation and Philosophical Foundation of Quantum Mechanics. In <em>Grenzen menschlicher Existenz</em>. Hans Daub ed., Petersberg: Michael Imhof Verlag. Available at: <a href="http://www.quantum.at/zeilinger">http://www.quantum.at/zeilinger</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathblog.ellerman.org/2011/11/a-common-qm-fallacy/feed/</wfw:commentRss>
		<slash:comments>23</slash:comments>
		</item>
		<item>
		<title>History of the Logical Entropy Formula</title>
		<link>http://www.mathblog.ellerman.org/2010/02/history-of-the-logical-entropy-formula/</link>
		<comments>http://www.mathblog.ellerman.org/2010/02/history-of-the-logical-entropy-formula/#comments</comments>
		<pubDate>Tue, 16 Feb 2010 21:18:16 +0000</pubDate>
		<dc:creator>David Ellerman</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Information theory]]></category>
		<category><![CDATA[Partition logic]]></category>
		<category><![CDATA[cryptography]]></category>
		<category><![CDATA[Gini index of diversity]]></category>
		<category><![CDATA[logical entropy]]></category>
		<category><![CDATA[numbers-equivalents]]></category>
		<category><![CDATA[Rao's quadratic entropy]]></category>
		<category><![CDATA[relative shares]]></category>
		<category><![CDATA[Simpson's index of diversity]]></category>
		<category><![CDATA[Turing]]></category>

		<guid isPermaLink="false">http://www.mathblog.ellerman.org/?p=47</guid>
		<description><![CDATA[The logical entropy formula Given a partition on a finite universe set U, the set of distinctions or dits is the set of ordered pairs of elements in distinct blocks of the partition. The logical entropy of the partition is the normalized cardinality of the dit set: . The logical entropy can be interpreted probabilistically [...]]]></description>
			<content:encoded><![CDATA[<h3>The logical entropy formula <img src='http://s.wordpress.com/latex.php?latex=h%28p%29%20%3D%201-%5Csum_%7Bi%7Dp_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(p) = 1-\sum_{i}p_{i}^{2}' title='h(p) = 1-\sum_{i}p_{i}^{2}' class='latex' /></h3>
<p>Given a partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%3D%20%5C%7BB%2CB%27%2C%5Cldots%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi = \{B,B&#039;,\ldots\}' title='\pi = \{B,B&#039;,\ldots\}' class='latex' /> on a finite universe set U, the <em>set of distinctions</em> or <em>dits</em> <img src='http://s.wordpress.com/latex.php?latex=dit%28%5Cpi%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='dit(\pi)' title='dit(\pi)' class='latex' /> is the set <img src='http://s.wordpress.com/latex.php?latex=%5Ccup_%7BB%5Cin%20%5Cpi%7DB%20%5Ctimes%20%28U-B%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\cup_{B\in \pi}B \times (U-B)' title='\cup_{B\in \pi}B \times (U-B)' class='latex' /> of ordered pairs of elements in distinct blocks of the partition. The <a href="http://www.mathblog.ellerman.org/2010/02/from-partition-logic-to-information-theory/"><em>logical entropy</em></a> <img src='http://s.wordpress.com/latex.php?latex=h%28%5Cpi%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(\pi)' title='h(\pi)' class='latex' /> of the partition is the normalized cardinality of the dit set: <img src='http://s.wordpress.com/latex.php?latex=h%28%5Cpi%29%20%3D%20%5Cfrac%7B%7Cdit%28%5Cpi%29%7C%7D%7B%7CU%5Ctimes%20U%7C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(\pi) = \frac{|dit(\pi)|}{|U\times U|}' title='h(\pi) = \frac{|dit(\pi)|}{|U\times U|}' class='latex' />. The logical entropy can be interpreted probabilistically as the probability that the random drawing of a pair of elements (with replacement between draws) from U (with equiprobable elements) gives a distinction of the partition. In terms of the probability <img src='http://s.wordpress.com/latex.php?latex=p_%7BB%7D%20%3D%20%5Cfrac%7B%7CB%7C%7D%7B%7CU%7C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{B} = \frac{|B|}{|U|}' title='p_{B} = \frac{|B|}{|U|}' class='latex' />, the logical entropy could be computed as:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=h%28%5Cpi%29%20%3D%20%5Csum_%7BB%20%5Cin%20%5Cpi%7Dp_%7BB%7D%281-p_%7BB%7D%29%20%3D%201-%5Csum_%7BB%20%5Cin%20%5Cpi%7Dp_%7BB%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(\pi) = \sum_{B \in \pi}p_{B}(1-p_{B}) = 1-\sum_{B \in \pi}p_{B}^{2}' title='h(\pi) = \sum_{B \in \pi}p_{B}(1-p_{B}) = 1-\sum_{B \in \pi}p_{B}^{2}' class='latex' />.</p>
<p><span id="more-47"></span></p>
<p>The set-complement of the dit set <img src='http://s.wordpress.com/latex.php?latex=dit%28%5Cpi%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='dit(\pi)' title='dit(\pi)' class='latex' /> within the Cartesian product <img src='http://s.wordpress.com/latex.php?latex=U%20%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U \times U' title='U \times U' class='latex' /> is the set <img src='http://s.wordpress.com/latex.php?latex=indit%28%5Cpi%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='indit(\pi)' title='indit(\pi)' class='latex' /> of <em>indistinctions</em> or <em>indits</em>, i.e., the ordered pairs of elements in the same block of the partition. The complement <img src='http://s.wordpress.com/latex.php?latex=1-h%28%5Cpi%29%3D%20%5Csum_%7BB%20%5Cin%20%5Cpi%7Dp_%7BB%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1-h(\pi)= \sum_{B \in \pi}p_{B}^{2}' title='1-h(\pi)= \sum_{B \in \pi}p_{B}^{2}' class='latex' /> of the logical entropy is the normalized count of the indistinctions. It is the probability that a randomly drawn pair will give two elements in the same block.</p>
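<p>The two ways of computing the logical entropy of a partition can be checked against each other with a small Python sketch. The function names and the four-element example are hypothetical, chosen only for illustration; exact fractions are used so that the two results can be compared for equality.</p>

```python
from fractions import Fraction
from itertools import product

def dit_set(partition):
    """Ordered pairs (u, v) whose elements lie in distinct blocks."""
    block_of = {u: i for i, block in enumerate(partition) for u in block}
    universe = sorted(block_of)
    return {(u, v) for u, v in product(universe, repeat=2)
            if block_of[u] != block_of[v]}

def logical_entropy_of_partition(partition):
    """h(pi) = |dit(pi)| / |U x U|, as an exact fraction."""
    n = sum(len(block) for block in partition)
    return Fraction(len(dit_set(partition)), n * n)

def logical_entropy_from_blocks(partition):
    """h(pi) = 1 - sum of p_B^2 with p_B = |B| / |U|."""
    n = sum(len(block) for block in partition)
    return 1 - sum(Fraction(len(block), n) ** 2 for block in partition)

# Hypothetical example: U = {a, b, c, d} with blocks {a, b}, {c}, {d}
pi = [{"a", "b"}, {"c"}, {"d"}]
print(len(dit_set(pi)))                   # 10 dits out of 16 ordered pairs
print(logical_entropy_of_partition(pi))   # 5/8
print(logical_entropy_from_blocks(pi))    # 5/8
```

<p>The remaining 6 of the 16 ordered pairs are the indits, so the complement is 6/16 = 3/8.</p>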
<p>Replacing each block probability <img src='http://s.wordpress.com/latex.php?latex=p_%7BB%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{B}' title='p_{B}' class='latex' /> by a probability <img src='http://s.wordpress.com/latex.php?latex=p_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{i}' title='p_{i}' class='latex' /> from a finite probability distribution <img src='http://s.wordpress.com/latex.php?latex=p%20%3D%20%5C%7Bp_%7B1%7D%2C%5Cldots%2Cp_%7Bn%7D%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p = \{p_{1},\ldots,p_{n}\}' title='p = \{p_{1},\ldots,p_{n}\}' class='latex' />, the logical entropy <img src='http://s.wordpress.com/latex.php?latex=h%28p%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(p)' title='h(p)' class='latex' /> of the finite probability distribution (or finite random variable) is:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=h%28p%29%20%3D%201-%20%5Csum_%7Bi%3D1%7D%5E%7Bn%7Dp_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(p) = 1- \sum_{i=1}^{n}p_{i}^{2}' title='h(p) = 1- \sum_{i=1}^{n}p_{i}^{2}' class='latex' />.</p>
<p>It is the probability that in two independent trials, the random variable will take on different values.</p>
<h3>The relative shares interpretation</h3>
<p>The logical entropy formula makes perfectly good sense when the <img src='http://s.wordpress.com/latex.php?latex=p_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{i}' title='p_{i}' class='latex' /> are interpreted nonprobabilistically as the relative shares in some total where <img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Bi%7Dp_%7Bi%7D%3D%201&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum_{i}p_{i}= 1' title='\sum_{i}p_{i}= 1' class='latex' />. Then it is a measure of the heterogeneity or diversity of the shares, with the maximum value being obtained for equal shares.</p>
<p>The complement <img src='http://s.wordpress.com/latex.php?latex=1-h%28p%29%3D%20%5Csum_%7Bi%3D1%7D%5E%7Bn%7Dp_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1-h(p)= \sum_{i=1}^{n}p_{i}^{2}' title='1-h(p)= \sum_{i=1}^{n}p_{i}^{2}' class='latex' /> is the probability that in two independent trials, the random variable will have the same value. If the <img src='http://s.wordpress.com/latex.php?latex=p_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{i}' title='p_{i}' class='latex' /> are interpreted as relative shares, then the complement is a measure of homogeneity or concentration where the maximum value is obtained when one of the <img src='http://s.wordpress.com/latex.php?latex=p_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{i}' title='p_{i}' class='latex' /> is 1 and the others are 0.</p>
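<p>A short Python sketch can make the two extremes concrete. The three share vectors below are hypothetical examples, not data from any source: equal shares give the largest logical entropy, and a single share of 1 gives the largest value of the complement.</p>

```python
from fractions import Fraction

def logical_entropy(p):
    """h(p) = 1 - sum_i p_i^2 for relative shares p_i that sum to 1."""
    if sum(p) != 1:
        raise ValueError("shares must sum to 1")
    return 1 - sum(x * x for x in p)

# Hypothetical share vectors over four parties.
equal = [Fraction(1, 4)] * 4
skewed = [Fraction(7, 10), Fraction(1, 10), Fraction(1, 10), Fraction(1, 10)]
monopoly = [Fraction(1), Fraction(0), Fraction(0), Fraction(0)]

for name, shares in [("equal", equal), ("skewed", skewed), ("monopoly", monopoly)]:
    h = logical_entropy(shares)
    print(name, "h =", h, "and 1 - h =", 1 - h)
```

<p>For these inputs the diversity measure falls from 3/4 to 12/25 to 0 as the shares become more concentrated, while the complement rises from 1/4 to 13/25 to 1.</p>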
<p>According to the late statistician <a href="http://en.wikipedia.org/wiki/I._J._Good">I. J. Good</a>, the formula (in the complementary form) has a certain naturalness: &#8220;If <img src='http://s.wordpress.com/latex.php?latex=p_%7B1%7D%2C%5Cldots%2Cp_%7Bt%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{1},\ldots,p_{t}' title='p_{1},\ldots,p_{t}' class='latex' /> are the probabilities of t mutually exclusive and exhaustive events, any statistician of this century who wanted a measure of homogeneity would take about two seconds to suggest <img src='http://s.wordpress.com/latex.php?latex=%5Csum%20p_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum p_{i}^{2}' title='\sum p_{i}^{2}' class='latex' /> which I shall call <img src='http://s.wordpress.com/latex.php?latex=%5Crho&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\rho' title='\rho' class='latex' />.&#8221; [Good 1982, p. 561]</p>
<h3>History of the formula</h3>
<p>This logical entropy formula and its complement have a long and varied interdisciplinary history. The formula goes back at least to <a href="http://en.wikipedia.org/wiki/Corrado_Gini">Corrado Gini</a> who suggested <img src='http://s.wordpress.com/latex.php?latex=1-%5Csum_%7Bi%7Dp_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1-\sum_{i}p_{i}^{2}' title='1-\sum_{i}p_{i}^{2}' class='latex' /> as an index of &#8220;mutability&#8221; or diversity in the same 1912 paper, <em>Variabilità e mutabilità</em>, where he defined his far more famous index of inequality now known as the <a href="http://en.wikipedia.org/wiki/Gini_coefficient">Gini coefficient</a>.</p>
<h4>The formula in cryptography</h4>
<p>But another development of the formula (in the complementary form) in the early twentieth century was in cryptography. The American cryptologist, <a href="http://en.wikipedia.org/wiki/William_Friedman">William F. Friedman</a>, devoted a 1922 book to the &#8220;index of coincidence&#8221; (i.e., <img src='http://s.wordpress.com/latex.php?latex=%5Csum%20p_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum p_{i}^{2}' title='\sum p_{i}^{2}' class='latex' />) which, according to one author on the topic, &#8220;must be regarded as the most important single publication in cryptology.&#8221; [Kahn 1967] Two mathematicians, <a href="http://en.wikipedia.org/wiki/Solomon_Kullback">Solomon Kullback</a> and <a href="http://en.wikipedia.org/wiki/Abraham_Sinkov">Abraham Sinkov</a>, who at one time worked as assistants to Friedman, also wrote books on cryptology which used the index [Kullback 1976; Sinkov 1968].</p>
<p>During World War II, <a href="http://en.wikipedia.org/wiki/Alan_turing">Alan M. Turing</a> worked for a time in the Government Code and Cypher School at the <a href="http://en.wikipedia.org/wiki/Bletchley_Park">Bletchley Park</a> facility in England. Probably unaware of the earlier work, Turing used <img src='http://s.wordpress.com/latex.php?latex=%5Crho%20%3D%20%5Csum_%7Bi%7D%20p_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\rho = \sum_{i} p_{i}^{2}' title='\rho = \sum_{i} p_{i}^{2}' class='latex' /> in his cryptanalysis work and called it the <em>repeat rate</em> since it is the probability of a repeat in a pair of independent draws from a population with those probabilities (i.e., the identification probability <img src='http://s.wordpress.com/latex.php?latex=1-h%28p%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1-h(p)' title='1-h(p)' class='latex' />). Polish cryptanalysts had independently used the repeat rate in their work on the <a href="http://en.wikipedia.org/wiki/Enigma_machine">Enigma machine</a> [Rejewski 1981].</p>
<h4>The formula in biostatistics</h4>
<p>After the war, <a href="http://en.wikipedia.org/wiki/Edward_H._Simpson">Edward H. Simpson</a>, a British statistician, proposed <img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7BB%20%5Cin%20%5Cpi%7Dp_%7BB%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum_{B \in \pi}p_{B}^{2}' title='\sum_{B \in \pi}p_{B}^{2}' class='latex' /> as a measure of species concentration (the opposite of diversity) where <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /> is the partition of animals or plants according to species and where each animal or plant is considered as equiprobable. And Simpson gave the interpretation of this homogeneity measure as &#8220;the probability that two individuals chosen at random and independently from the population will be found to belong to the same group.&#8221; [Simpson 1949] Hence <img src='http://s.wordpress.com/latex.php?latex=1-%5Csum_%7BB%20%5Cin%20%5Cpi%7Dp_%7BB%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1-\sum_{B \in \pi}p_{B}^{2}' title='1-\sum_{B \in \pi}p_{B}^{2}' class='latex' /> is the probability that a random pair will belong to different species, i.e., will be distinguished by the species partition. In the biodiversity literature [Ricotta and Szeidl 2006], the formula is known as &#8220;<a href="http://en.wikipedia.org/wiki/Index_of_diversity">Simpson&#8217;s index of diversity</a>&#8221; or sometimes, the &#8220;Gini-Simpson diversity index.&#8221;</p>
<p>However, Simpson along with I. J. Good worked at Bletchley during WWII, and, according to Good, &#8220;E. H. Simpson and I both obtained the notion [the repeat rate] from Turing.&#8221; [Good 1979] When Simpson published the index in 1949, he (again, according to Good) did not acknowledge Turing &#8220;fearing that to acknowledge him would be regarded as a breach of security.&#8221; [Good 1982]</p>
<h4>The formula in economics</h4>
<p>There is at least a third independent development of the formula. In 1945, <a href="http://en.wikipedia.org/wiki/Albert_O._Hirschman">Albert O. Hirschman</a> [1945, 1964] suggested using <img src='http://s.wordpress.com/latex.php?latex=%5Csqrt%7B%5Csum_%7Bi%7Dp_%7Bi%7D%5E%7B2%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sqrt{\sum_{i}p_{i}^{2}}' title='\sqrt{\sum_{i}p_{i}^{2}}' class='latex' /> as an index of trade concentration (where <img src='http://s.wordpress.com/latex.php?latex=p_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{i}' title='p_{i}' class='latex' /> is the relative share of trade in a certain commodity or with a certain partner). A few years later, Orris Herfindahl [1950] independently suggested using <img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Bi%7Dp_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum_{i}p_{i}^{2}' title='\sum_{i}p_{i}^{2}' class='latex' /> as an index of industrial concentration (where <img src='http://s.wordpress.com/latex.php?latex=p_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{i}' title='p_{i}' class='latex' /> is the relative share of the <img src='http://s.wordpress.com/latex.php?latex=i%5E%7Bth%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='i^{th}' title='i^{th}' class='latex' /> firm in an industry). In the industrial economics literature, the index <img src='http://s.wordpress.com/latex.php?latex=H%20%3D%20%5Csum_%7Bi%7Dp_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='H = \sum_{i}p_{i}^{2}' title='H = \sum_{i}p_{i}^{2}' class='latex' /> is variously called the <a href="http://en.wikipedia.org/wiki/Herfindahl_index">Hirschman-Herfindahl index</a>, the HH index, or just the H index of concentration. 
If all the relative shares were equal (i.e., <img src='http://s.wordpress.com/latex.php?latex=p_%7Bi%7D%3D1%2Fn&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{i}=1/n' title='p_{i}=1/n' class='latex' />), then the identification or repeat probability is just the probability of drawing any element, i.e., <img src='http://s.wordpress.com/latex.php?latex=H%3D1%2Fn&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='H=1/n' title='H=1/n' class='latex' />, so <img src='http://s.wordpress.com/latex.php?latex=1%2FH%20%3D%20n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1/H = n' title='1/H = n' class='latex' /> is the number of equal elements. This led to the &#8220;numbers equivalent&#8221; interpretation of the reciprocal of the H index [Adelman 1969]. The basic idea of the numbers-equivalent was introduced by <a href="http://en.wikipedia.org/wiki/Robert_MacArthur">Robert MacArthur</a> [1965] slightly earlier in the biodiversity literature as a way to interpret the antilog of the Shannon entropy (i.e., the block-count entropy in <a href="http://www.ellerman.org/Davids-Stuff/Maths/Counting-Dits-reprint.pdf">Ellerman 2009</a>).</p>
<h3>The numbers-equivalent interpretation</h3>
<p>In general, given an event with probability <img src='http://s.wordpress.com/latex.php?latex=p_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{i}' title='p_{i}' class='latex' />, the <em>numbers-equivalent</em> interpretation of the event is that it is &#8216;as if&#8217; an element was drawn out of a set of <img src='http://s.wordpress.com/latex.php?latex=1%2Fp_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1/p_{i}' title='1/p_{i}' class='latex' /> equiprobable elements (it is &#8216;as if&#8217; since <img src='http://s.wordpress.com/latex.php?latex=1%2Fp_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1/p_{i}' title='1/p_{i}' class='latex' /> need not be an integer). When drawing from a population of n equiprobable elements, the probability of drawing (with replacement) two distinct elements is the logical entropy <img src='http://s.wordpress.com/latex.php?latex=h%28p%29%20%3D%201-%281%2Fn%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(p) = 1-(1/n)' title='h(p) = 1-(1/n)' class='latex' /> (i.e., the probability that the second draw is different from the first). The complementary form <img src='http://s.wordpress.com/latex.php?latex=1-h%28p%29%3D%201%2Fn&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1-h(p)= 1/n' title='1-h(p)= 1/n' class='latex' /> is the identification probability of drawing the same element twice. 
Hence in general, given a probability distribution <img src='http://s.wordpress.com/latex.php?latex=p%20%3D%20%5C%7Bp_%7B1%7D%2C%5Cldots%2Cp_%7Bn%7D%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p = \{p_{1},\ldots,p_{n}\}' title='p = \{p_{1},\ldots,p_{n}\}' class='latex' />, the numbers-equivalent interpretation is that it is &#8216;as if&#8217; the random drawing of a pair was from a set of <img src='http://s.wordpress.com/latex.php?latex=n%3D%5Cfrac%7B1%7D%7B1-h%28p%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n=\frac{1}{1-h(p)}' title='n=\frac{1}{1-h(p)}' class='latex' /> equiprobable elements, i.e., such an equiprobable distribution would have the same logical entropy as p. The numbers-equivalent concept shows up again in the interpretation of the (antilog of) Shannon entropy so it is one way to construct the conceptual bridge between the concepts of logical entropy and Shannon entropy.</p>
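<p>The numbers-equivalent can be computed directly as the reciprocal of the repeat rate. The following Python sketch uses hypothetical function names and two made-up distributions; the second shows the &#8216;as if&#8217; character of the interpretation, since the result is not an integer.</p>

```python
from fractions import Fraction

def repeat_rate(p):
    """H = sum_i p_i^2, the identification (repeat) probability 1 - h(p)."""
    return sum(x * x for x in p)

def numbers_equivalent(p):
    """1 / (1 - h(p)): size of the equiprobable set with the same logical entropy."""
    return 1 / repeat_rate(p)

# Five equal shares: the numbers-equivalent recovers n = 5.
uniform = [Fraction(1, 5)] * 5
print(numbers_equivalent(uniform))   # 5

# Unequal shares: 'as if' drawing from 8/3 equiprobable elements.
shares = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
print(numbers_equivalent(shares))    # 8/3
```

<p>In the second case there are three elements, but the concentration on the first one makes the distribution behave, for the purposes of the repeat rate, like fewer than three equal ones.</p>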
<h3>Many independent discoveries of the formula</h3>
<p>In view of the multiple independent discoveries of the formula <img src='http://s.wordpress.com/latex.php?latex=%5Crho%20%3D%20%5Csum%20p_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\rho = \sum p_{i}^{2}' title='\rho = \sum p_{i}^{2}' class='latex' /> or its complement <img src='http://s.wordpress.com/latex.php?latex=1-%20%5Csum%20p_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1- \sum p_{i}^{2}' title='1- \sum p_{i}^{2}' class='latex' /> by Gini, Friedman, Turing, Hirschman, Herfindahl, and no doubt others, I. J. Good wisely advises that &#8220;it is unjust to associate ρ with any one person.&#8221; [Good 1982] The name &#8220;logical entropy&#8221; for <img src='http://s.wordpress.com/latex.php?latex=h%28p%29%20%3D%201-%20%5Csum_%7Bi%3D1%7D%5E%7Bn%7Dp_%7Bi%7D%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(p) = 1- \sum_{i=1}^{n}p_{i}^{2}' title='h(p) = 1- \sum_{i=1}^{n}p_{i}^{2}' class='latex' /> not only denotes the basic logical status of the formula, it follows &#8220;Stigler&#8217;s Law of Eponymy&#8221;: &#8220;No scientific discovery is named after its original discoverer.&#8221; [Stigler 1999]</p>
<h3>C.R. Rao&#8217;s quadratic entropy</h3>
<p>From the logical viewpoint, two elements from <img src='http://s.wordpress.com/latex.php?latex=U%20%3D%20%5C%7Bu_%7B1%7D%2C%5Cldots%2Cu_%7Bn%7D%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U = \{u_{1},\ldots,u_{n}\}' title='U = \{u_{1},\ldots,u_{n}\}' class='latex' /> are either identical or distinct. Gini [1912] introduced <img src='http://s.wordpress.com/latex.php?latex=d_%7Bij%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d_{ij}' title='d_{ij}' class='latex' /> as the &#8220;distance&#8221; between the <img src='http://s.wordpress.com/latex.php?latex=i%5E%7Bth%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='i^{th}' title='i^{th}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=j%5E%7Bth%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='j^{th}' title='j^{th}' class='latex' /> elements where <img src='http://s.wordpress.com/latex.php?latex=d_%7Bij%7D%3D1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d_{ij}=1' title='d_{ij}=1' class='latex' /> for <img src='http://s.wordpress.com/latex.php?latex=i%20%5Cnot%3D%20j&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='i \not= j' title='i \not= j' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=d_%7Bii%7D%3D0&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d_{ii}=0' title='d_{ii}=0' class='latex' />. Since</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=1%3D%20%28p_%7B1%7D%2B%5Cldots%2Bp_%7Bn%7D%29%28p_%7B1%7D%2B%5Cldots%2Bp_%7Bn%7D%29%20%3D%20%5Csum_%7Bi%7Dp_%7Bi%7D%5E%7B2%7D%2B%20%5Csum_%7Bi%5Cnot%3Dj%7Dp_%7Bi%7Dp_%7Bj%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1= (p_{1}+\ldots+p_{n})(p_{1}+\ldots+p_{n}) = \sum_{i}p_{i}^{2}+ \sum_{i\not=j}p_{i}p_{j}' title='1= (p_{1}+\ldots+p_{n})(p_{1}+\ldots+p_{n}) = \sum_{i}p_{i}^{2}+ \sum_{i\not=j}p_{i}p_{j}' class='latex' />,</p>
<p>the logical entropy, i.e., Gini&#8217;s index of mutability,</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=h%28p%29%3D1-%5Csum_%7Bi%7Dp_%7Bi%7D%5E%7B2%7D%3D%5Csum_%7Bi%5Cnot%3Dj%7Dp_%7Bi%7Dp_%7Bj%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(p)=1-\sum_{i}p_{i}^{2}=\sum_{i\not=j}p_{i}p_{j}' title='h(p)=1-\sum_{i}p_{i}^{2}=\sum_{i\not=j}p_{i}p_{j}' class='latex' />,</p>
<p>is the <em>average logical distance</em> between a pair of independently drawn elements. But one might generalize by allowing other distances <img src='http://s.wordpress.com/latex.php?latex=d_%7Bij%7D%3Dd_%7Bji%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d_{ij}=d_{ji}' title='d_{ij}=d_{ji}' class='latex' /> for <img src='http://s.wordpress.com/latex.php?latex=i%20%5Cnot%3D%20j&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='i \not= j' title='i \not= j' class='latex' /> (but always <img src='http://s.wordpress.com/latex.php?latex=d_%7Bii%7D%3D0&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d_{ii}=0' title='d_{ii}=0' class='latex' />) so that <img src='http://s.wordpress.com/latex.php?latex=Q%3D%5Csum_%7Bi%5Cnot%3Dj%7Dd_%7Bij%7Dp_%7Bi%7Dp_%7Bj%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='Q=\sum_{i\not=j}d_{ij}p_{i}p_{j}' title='Q=\sum_{i\not=j}d_{ij}p_{i}p_{j}' class='latex' /> would be the <em>average generalized distance</em> between a pair of independently drawn elements from U. In 1982, <a href="http://en.wikipedia.org/wiki/Calyampudi_Radhakrishna_Rao">C. R. (Calyampudi Radhakrishna) Rao</a> introduced this concept as <em>quadratic entropy</em> [1982] (which was later rediscovered in the biodiversity literature as the &#8220;Avalanche Index&#8221; by Ganeshaiah et al. [1997]). In many domains, it is quite reasonable to move beyond the bare-bones logical distance of <img src='http://s.wordpress.com/latex.php?latex=d_%7Bij%7D%3D1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='d_{ij}=1' title='d_{ij}=1' class='latex' /> for <img src='http://s.wordpress.com/latex.php?latex=i%20%5Cnot%3D%20j&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='i \not= j' title='i \not= j' class='latex' /> so that Rao&#8217;s quadratic entropy is a useful and easily interpreted generalization of logical entropy.</p>
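<p>The relation between the two quantities can be seen in a few lines of Python. This is only a sketch: the function names, the distribution, and the second distance matrix are hypothetical. With the logical distance, Q reduces to the logical entropy; with other symmetric distances it weights each pair of distinct elements by how far apart they are.</p>

```python
from fractions import Fraction

def quadratic_entropy(p, d):
    """Q = sum over i != j of d[i][j] * p[i] * p[j]."""
    n = len(p)
    return sum(d[i][j] * p[i] * p[j]
               for i in range(n) for j in range(n) if i != j)

def logical_distance(n):
    """d_ij = 1 for i != j and d_ii = 0."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]

p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

# With the logical distance, Q equals h(p) = 1 - sum_i p_i^2.
print(quadratic_entropy(p, logical_distance(3)))   # 5/8
print(1 - sum(x * x for x in p))                   # 5/8

# A hypothetical symmetric distance in which elements 0 and 2 are farther apart.
d = [
    [0, 1, 2],
    [1, 0, 1],
    [2, 1, 0],
]
print(quadratic_entropy(p, d))                     # 7/8
```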
<h3>Bibliography</h3>
<p>Adelman, M. A. 1969. Comment on the H Concentration Measure as a Numbers-Equivalent. <em>Review of Economics and Statistics.</em> 51: 99-101.</p>
<p>Ellerman, David 2009. Counting Distinctions: On the Conceptual Foundations of Shannon&#8217;s Information Theory. <em>Synthese</em>. 168 (1 (May)): 119-149.</p>
<p>Friedman, William F. 1922. <em>The Index of Coincidence and Its Applications in Cryptography</em>. Geneva IL: Riverbank Laboratories.</p>
<p>Ganeshaiah, K. N., K. Chandrashekara and A. R. V. Kumar 1997. Avalanche Index: A new measure of biodiversity based on biological heterogeneity of communities. <em>Current Science.</em> 73: 128-33.</p>
<p>Gini, Corrado 1912. <em>Variabilità e mutabilità</em>. Bologna: Tipografia di Paolo Cuppini.</p>
<p>Good, I. J. 1979. A.M. Turing&#8217;s statistical work in World War II. <em>Biometrika.</em> 66 (2): 393-6.</p>
<p>Good, I. J. 1982. Comment (on Patil and Taillie: Diversity as a Concept and its Measurement). <em>Journal of the American Statistical Association.</em> 77 (379): 561-3.</p>
<p>Herfindahl, Orris C. 1950. <em>Concentration in the U.S. Steel Industry</em>. Unpublished doctoral dissertation: Columbia University.</p>
<p>Hirschman, Albert O. 1945. <em>National power and the structure of foreign trade</em>. Berkeley: University of California Press.</p>
<p>Hirschman, Albert O. 1964. The Paternity of an Index. <em>American Economic Review.</em> 54 (5): 761-2.</p>
<p>Kahn, David 1967. <em>The Codebreakers: The Story of Secret Writing</em>. New York: Macmillan.</p>
<p>Kullback, Solomon 1976. <em>Statistical Methods in Cryptanalysis</em>. Walnut Creek CA: Aegean Park Press.</p>
<p>MacArthur, Robert H. 1965. Patterns of Species Diversity. <em>Biol. Rev.</em> 40: 510-33.</p>
<p>Rao, C. Radhakrishna 1982. Diversity and Dissimilarity Coefficients: A Unified Approach. <em>Theoretical Population Biology.</em> 21: 24-43.</p>
<p>Rejewski, M. 1981. How Polish Mathematicians Deciphered the Enigma. <em>Annals of the History of Computing.</em> 3: 213-34.</p>
<p>Ricotta, Carlo and Laszlo Szeidl 2006. Towards a unifying approach to diversity measures: Bridging the gap between the Shannon entropy and Rao&#8217;s quadratic index. <em>Theoretical Population Biology.</em> 70: 237-43.</p>
<p>Simpson, Edward Hugh 1949. Measurement of Diversity. <em>Nature.</em> 163: 688.</p>
<p>Sinkov, Abraham 1968. <em>Elementary Cryptanalysis: A Mathematical Approach</em>. New York: Random House.</p>
<p>Stigler, Stephen M. 1999. <em>Statistics on the Table</em>. Cambridge: Harvard University Press.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathblog.ellerman.org/2010/02/history-of-the-logical-entropy-formula/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Arbitrage and Graphical Gridlock</title>
		<link>http://www.mathblog.ellerman.org/2010/02/arbitrage-and-graphical-gridlock/</link>
		<comments>http://www.mathblog.ellerman.org/2010/02/arbitrage-and-graphical-gridlock/#comments</comments>
		<pubDate>Wed, 10 Feb 2010 06:02:01 +0000</pubDate>
		<dc:creator>David Ellerman</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Math economics]]></category>
		<category><![CDATA[arbitrage]]></category>
		<category><![CDATA[clique formation]]></category>
		<category><![CDATA[gear trains]]></category>
		<category><![CDATA[gridlock]]></category>
		<category><![CDATA[social networks]]></category>

		<guid isPermaLink="false">http://www.mathblog.ellerman.org/?p=29</guid>
		<description><![CDATA[The arbitrage-free law (or Kirchhoff&#8217;s voltage law) Recently I emailed a friend to complain when his organization used this 3 gear image as their logo. What was my complaint? Read on. The basic idea of arbitrage is to &#8220;get something for nothing&#8221; by trading commodities or currencies around some circle ending up with more than [...]]]></description>
			<content:encoded><![CDATA[<h3>The arbitrage-free law (or Kirchhoff&#8217;s voltage law)</h3>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/CMRC-gridlock1.jpg"><img class="alignleft size-thumbnail wp-image-41" title="CMRC-gridlock" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/CMRC-gridlock1-150x150.jpg" alt="" width="150" height="150" /></a></p>
<p>Recently I emailed a friend to complain when his organization used this 3 gear image as their logo. What was my complaint? Read on.</p>
<p>The basic idea of arbitrage is to &#8220;get something for nothing&#8221; by trading commodities or currencies around some circle ending up with more than one started with.  The simplest example would be to have a commodity which can be bought and sold at two different prices.  Then a clear profit is obtained by the arbitrage operation of buying low and selling high.</p>
<p>The circular pattern is that money is transformed into so many units of the commodity at the low price. Say $1 buys 4 apples so each apple is priced at 25¢. Then these units of the commodity are transformed back into money at the high price, say at 50¢ per apple so the 4 apples would sell for $2.  The arbitrage profit is the additional money received back over the money originally paid out to purchase the commodity, in this case $2 – $1 = $1. We assume that no transaction costs are involved with the exchanges.</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/buy-low-sell-high1.jpg"><img class="alignleft size-medium wp-image-42" title="buy-low-sell-high" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/buy-low-sell-high1-300x94.jpg" alt="" width="300" height="94" /></a>Such an example of arbitrage profit could not be sustained for long since everyone would want to make the circular trades and get something for nothing. Those who were willing to sell apples at 25¢ would raise their prices and those who were willing to buy apples for 50¢ would offer less until an equilibrium price was reached somewhere between 25¢ and 50¢, say at 40¢. Then we would have a situation called &#8220;arbitrage-free&#8221; where circular trades would just break even (always assuming no transaction costs).</p>
<p><span id="more-29"></span></p>
<p>When profitable arbitrage was possible, then apples in effect had two prices. When the market is arbitrage-free, then the &#8220;law of one price&#8221; holds. Then and only then can prices be assigned to the commodities being traded so that all exchange rates are just price ratios. For instance, when the one price of an apple at 40¢ prevailed, then $1 could be exchanged for 2.5 = 5/2 apples and 1 apple could be exchanged for 2/5 of a dollar.</p>
<p>Another way to characterize the arbitrage-free situation is that if you multiply exchange rates around a circle, then they have to multiply to 1 as in (5/2) x (2/5) = 1, which simply says that if you start with $1 and make the circular exchange, then you have to end with $1. A market with many commodities (and no transaction costs) is <em>arbitrage-free</em> if, for any possible circular exchange (e.g., dollars for apples for oranges for dollars), the product of all the exchange rates is 1.</p>
<div id="attachment_43" class="wp-caption alignright" style="width: 206px"><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/3good-arbitrage-mult1.jpg"><img class="size-full wp-image-43 " title="3good-arbitrage-mult" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/3good-arbitrage-mult1.jpg" alt="" width="196" height="165" /></a><p class="wp-caption-text">(5/2) x (2/3) x (3/5) = 1</p></div>
<p>Each exchange ratio is the ratio (tail/head) of the prices of the goods. If in going around a circle, an arrow is traversed in the opposite direction, then one uses the reciprocal of its exchange rate. The basic mathematical result about being arbitrage-free is:</p>
<blockquote><p>there exist prices for the goods so that the exchange rates are the price ratios if and only if the product of the exchange rates around any circle is 1.</p></blockquote>
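<p>The result can be checked mechanically. Below is a minimal Python sketch, not taken from the original treatment: the dictionary keyed by (tail, head) arrows and the helper name <code>find_prices</code> are assumptions made for illustration. It fixes one good as numeraire, propagates prices along the arrows, and gives up as soon as some good would need two prices.</p>

```python
from fractions import Fraction

def find_prices(rates):
    """Return prices with rate == price[tail] / price[head] on every arrow,
    or None if some circle of exchange rates does not multiply to 1."""
    # Undirected view: crossing an arrow backwards uses the reciprocal rate.
    neighbors = {}
    for (tail, head), rate in rates.items():
        neighbors.setdefault(tail, []).append((head, Fraction(rate)))
        neighbors.setdefault(head, []).append((tail, 1 / Fraction(rate)))
    prices = {}
    for start in neighbors:
        if start in prices:
            continue
        prices[start] = Fraction(1)  # arbitrary numeraire for this component
        stack = [start]
        while stack:
            good = stack.pop()
            for other, rate in neighbors[good]:
                implied = prices[good] / rate  # since rate = price[good] / price[other]
                if other not in prices:
                    prices[other] = implied
                    stack.append(other)
                elif prices[other] != implied:
                    return None  # one good, two prices: arbitrage is possible
    return prices

# Hypothetical labels for the three goods; the rates are 5/2, 2/3 and 3/5.
rates = {
    ("dollars", "apples"): Fraction(5, 2),
    ("apples", "oranges"): Fraction(2, 3),
    ("oranges", "dollars"): Fraction(3, 5),
}
print(find_prices(rates))
```

<p>With these rates the sketch prices a dollar at 1, an apple at 2/5 and an orange at 3/5. Changing any single rate makes the circle multiply to something other than 1, and the function then returns <code>None</code>.</p>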
<p>In an early use of mathematical methods in economics, this law was observed by Cournot in 1838 [Cournot, Augustin. 1897 (orig. 1838). <em>Researches into the Mathematical Principles of the Theory of Wealth</em>. Trans. Nathaniel T. Bacon. New York: Macmillan]. But the better-known version of the mathematical result is the additive version. To translate from the multiplicative case to the additive case, replace 1 with 0, replace the ratio of two numbers with the difference of the two numbers, and replace multiplying around the circle with adding around the circle. Then the law becomes <em>Kirchhoff&#8217;s voltage law</em>, which could be formulated as:</p>
<blockquote><p>there exist potentials at the nodes of an electrical network so that the voltage on the wire between two nodes is the potential difference if and only if the sum of the voltages around any circle is 0.</p></blockquote>
<p>Although the mathematical result is usually attributed to Kirchhoff [Kirchhoff, G. 1847. "Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Verteilung galvanischer Ströme geführt wird." <em>Annalen der Physik und Chemie</em> 72: 497-508], Cournot had precedence by a few years.</p>
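<p>One way to see the translation between the two versions is to take logarithms, which turn ratios into differences and products into sums. The Python sketch below is only an illustration under stated assumptions: the node labels a, b, c and the helper name <code>sum_around</code> are made up, and the rates are the 5/2, 2/3, 3/5 of the three-good example.</p>

```python
import math

def sum_around(voltages, circle):
    """Add the voltages met while walking once around a circle of nodes.
    A wire crossed against its arrow contributes minus its voltage."""
    total = 0.0
    for tail, head in zip(circle, circle[1:] + circle[:1]):
        if (tail, head) in voltages:
            total += voltages[(tail, head)]
        else:
            total -= voltages[(head, tail)]
    return total

# Exchange rates turned into "voltages" by taking logarithms.
rates = {("a", "b"): 5 / 2, ("b", "c"): 2 / 3, ("c", "a"): 3 / 5}
voltages = {arrow: math.log(rate) for arrow, rate in rates.items()}

# The product of the rates is 1, so the logs sum to 0 up to rounding error.
print(math.isclose(sum_around(voltages, ["a", "b", "c"]), 0.0, abs_tol=1e-12))
```

<p>The logarithms of the prices then play the role of the potentials at the nodes.</p>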
<h3>The arbitrage-free law in social networks</h3>
<p>The arbitrage-free law (or KVL if one prefers) has applications in many areas of science. One amusing application is to the formation of cliques in social networks. Given a set of people, between any two people, assign a +1 for &#8220;like&#8221; or a –1 for &#8220;dislike.&#8221; Then the pattern of likes and dislikes has a coherence, internal consistency, or balance if they are arbitrage-free in the sense that the product around any circle is +1. An example that is not arbitrage-free, i.e., is unbalanced, is the famous mother-in-law triangle (the lines don&#8217;t need a direction in this case since +1 and –1 are equal to their own reciprocal).</p>
<div id="attachment_32" class="wp-caption alignleft" style="width: 200px"><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Mother-in-law.jpg"><img class="size-full wp-image-32 " title="Mother-in-law" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Mother-in-law.jpg" alt="" width="190" height="130" /></a><p class="wp-caption-text">An &quot;unbalanced&quot; social network</p></div>
<p>The arbitrage-free law then gives the basic result about clique formation: given a set of people with either a &#8220;like&#8221; or &#8220;dislike&#8221; between some pairs, the set can be partitioned into two cliques (e.g., the Hatfields and McCoys, Montagues and Capulets, Serbs and Croatians, Sunnis and Shiites,…) so that all likes are intra-clique and all dislikes are inter-clique if and only if the pattern is balanced (i.e., the product around any circle is +1). In the balanced case, one can assign a +1 to one clique and a –1 to the other clique so that all the +1&#8242;s and –1&#8242;s on the lines between people are the ratios of the &#8220;prices.&#8221; When a pattern of likes and dislikes is unbalanced (i.e., not arbitrage-free), then instead of a good having two prices, some people, like the wife in the mother-in-law triangle (or Romeo and Juliet), in effect belong to two families or two cliques.</p>
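<p>The clique result can be sketched the same way. In the Python below, the function name <code>split_into_cliques</code>, the data layout, and the particular signs given to the mother-in-law triangle (husband likes both women, who dislike each other) are assumptions for illustration only.</p>

```python
def split_into_cliques(feelings):
    """feelings maps a pair (a, b) to +1 (like) or -1 (dislike).
    Return {person: +1 or -1} labelling two cliques, or None if unbalanced."""
    neighbors = {}
    for (a, b), sign in feelings.items():
        neighbors.setdefault(a, []).append((b, sign))
        neighbors.setdefault(b, []).append((a, sign))
    clique = {}
    for start in neighbors:
        if start in clique:
            continue
        clique[start] = 1
        stack = [start]
        while stack:
            person = stack.pop()
            for other, sign in neighbors[person]:
                expected = clique[person] * sign  # a like keeps the label, a dislike flips it
                if other not in clique:
                    clique[other] = expected
                    stack.append(other)
                elif clique[other] != expected:
                    return None  # some circle multiplies to -1
    return clique

mother_in_law = {
    ("husband", "wife"): 1,
    ("husband", "mother"): 1,
    ("wife", "mother"): -1,
}
feud = {
    ("Hatfield 1", "Hatfield 2"): 1,
    ("Hatfield 1", "McCoy 1"): -1,
    ("Hatfield 2", "McCoy 1"): -1,
}
print(split_into_cliques(mother_in_law))
print(split_into_cliques(feud))
```

<p>The triangle has product –1 around its one circle, so no labelling exists; the feud has product +1 and splits into the two families.</p>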
<h3>Gridlocked gears</h3>
<p>The &#8220;graphical gridlock&#8221; in the title of this posting refers to trains of intermeshing gears that all lie in the same plane (e.g., could all be laid on a flat surface). If one turns one of the gears, that would in general transmit motion through the other gears. But suppose the gear train comes around in a circle? Then the possibility arises of the transmitted motion coming around the circle to oppose the original motion so that the gear train would be rigid or gridlocked.</p>
<p>One usually has to pay attention to the number of teeth in each gear to calculate the transmitted motion since the revolutions per minute of the different gears would be determined by the gear ratios. But if all the gears were in the same plane, then each gear&#8217;s number of teeth would appear once in the numerator of a gear ratio and once in the denominator so they would always cancel out in absolute value. Thus all that counts is whether the gear is going clockwise or counter-clockwise.</p>
<p>In every gear meshing (not all gears have to mesh), the two gears have to turn in opposite directions. If one is clockwise, the other is counter-clockwise and vice-versa. Thus each gear meshing is modeled as two people who &#8220;dislike&#8221; each other. All gear meshings are links with a –1. Then the arbitrage-free law has a very simple form since the product of –1&#8242;s is +1 if and only if there is an even number of –1&#8242;s in the product. Hence we have:</p>
<blockquote><p>a circular planar gear train can move if and only if there are an even number of gears in the circle.</p></blockquote>
<p>In that arbitrage-free or balanced case, the two cliques are the &#8220;clockwise&#8221; gears and the &#8220;counter-clockwise&#8221; gears. Or putting the law the other way around,</p>
<blockquote><p>a circular planar gear train has gridlock if and only if there are an odd number of gears.</p></blockquote>
<p>When graphic artists try to represent some type of machinery in a logo or symbol, they seem drawn like moths to a flame to use a circular gear train of three gears all lying in a plane. But since 3 is an odd number, such a gear train is rigid as one can easily see by trying to assign the cliques &#8220;clockwise&#8221; and &#8220;counter-clockwise&#8221; to each gear in a consistent manner (so that each meshing will reverse the clockwise direction).</p>
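<p>Trying to assign the two directions consistently is easy to automate. The following Python is a rough sketch under assumptions of my own choosing: gears are labelled by integers, a meshing is a pair of labels, and the names <code>gear_directions</code> and <code>ring</code> are hypothetical.</p>

```python
def gear_directions(meshings):
    """meshings is a list of (gear, gear) pairs that touch.
    Return {gear: "cw" or "ccw"} if the train can turn, else None (gridlock)."""
    touching = {}
    for a, b in meshings:
        touching.setdefault(a, set()).add(b)
        touching.setdefault(b, set()).add(a)
    flip = {"cw": "ccw", "ccw": "cw"}
    direction = {}
    for start in touching:
        if start in direction:
            continue
        direction[start] = "cw"
        stack = [start]
        while stack:
            gear = stack.pop()
            for other in touching[gear]:
                if other not in direction:
                    direction[other] = flip[direction[gear]]
                    stack.append(other)
                elif direction[other] == direction[gear]:
                    return None  # two meshed gears forced to turn the same way
    return direction

def ring(n):
    """Meshings for n gears arranged in a circle."""
    return [(i, (i + 1) % n) for i in range(n)]

print(gear_directions(ring(3)))  # the three-gear logo
print(gear_directions(ring(4)))
```

<p>The three-gear ring has no consistent assignment, while the four-gear ring alternates clockwise and counter-clockwise. Adding one extra gear that meshes with two neighboring gears of an even ring creates a three-gear circle and locks the train again.</p>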
<h3>Examples of graphical gridlock</h3>
<p>A good friend of mine is the head of a terrific group in Chicago dedicated to the rebirth of manufacturing jobs in the area and in America so it would be natural for them to have a logo with gears. To my surprise, the logo that first appeared on their newsletter was:</p>
<div id="attachment_33" class="wp-caption alignleft" style="width: 172px"><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/CMRC-gridlock.jpg"><img class="size-full wp-image-33 " title="CMRC-gridlock" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/CMRC-gridlock.jpg" alt="" width="162" height="169" /></a><p class="wp-caption-text">Gear Gridlock</p></div>
<p>When I pointed out the &#8220;gridlock&#8221; implications, he said they would certainly change it as soon as they ran through the printing of their stationery.</p>
<p>Over the years, I have collected examples of graphic artists picturing gridlock when they intend to picture smooth working machinery. For instance, <em>The Economist</em> magazine had the following cover to represent the &#8220;global factory.&#8221;</p>
<p>But one doubts they were trying to make the point that it does not  work!</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Global-factory-gears.jpg"><img class="aligncenter size-medium wp-image-34 " title="Global-factory-gears" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Global-factory-gears-221x300.jpg" alt="" width="221" height="300" /></a></p>
<p style="text-align: center;">This global factory will not work.</p>
<p>Or a consulting company might compare their input as applying some  oil to the company&#8217;s machinery.</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Need-more-than-oil.jpg"><img class="aligncenter" title="Need-more-than-oil" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Need-more-than-oil.jpg" alt="" width="192" height="188" /></a></p>
<p style="text-align: center;">This machinery needs more than oil.</p>
<p>Or taking a special training course may get one&#8217;s mental gears moving better, but not if they are in the following pattern.</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Teaching-co-gears.jpg"><img class="aligncenter size-large wp-image-37  " title="Teaching-co-gears" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Teaching-co-gears-1006x1024.jpg" alt="" width="362" height="368" /></a></p>
<p style="text-align: center;">This training course won&#8217;t help these gears.</p>
<p>Some graphic artists take a picture of real gears.</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/real-gearssmall1.jpg"><img class="aligncenter size-full wp-image-39 " title="real-gearssmall" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/real-gearssmall1.jpg" alt="" width="280" height="322" /></a></p>
<p style="text-align: center;">This gear train is only for the photograph.</p>
<p>Some graphic artists seem to finally get it when they put an even number of gears in a circular gear train. But then they are drawn again to the siren song of graphic gridlock by putting in the gear in the upper right corner that once again makes the whole gear train rigid.</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/sbc-gears.jpg"><img class="aligncenter size-medium wp-image-40" title="sbc-gears" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/sbc-gears-300x272.jpg" alt="" width="300" height="272" /></a>The mathematical treatment of these ideas and the connection with optimization theory is downloadable <a href="http://www.ellerman.org/Davids-Stuff/Maths/Arbitrage.DOC">here</a>.</p>
<p>Addendum: May 3, 2010<br />
My friend finally fixed his logo so it was no longer rigid.</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/CMRC-fixed.jpg"><img class="alignleft size-full wp-image-51" title="CMRC-fixed" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/CMRC-fixed.jpg" alt="" width="66" height="109" /></a></p>
<p>&#8211;<br />
Second addendum: Sept. 9, 2010</p>
<p>At last a graphic that uses the 3-gear-mesh gridlock correctly.</p>
<div id="attachment_53" class="wp-caption alignleft" style="width: 310px"><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Stalledgears.jpg"><img class="size-medium wp-image-53" title="Stalledgears" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Stalledgears-300x224.jpg" alt="" width="300" height="224" /></a><p class="wp-caption-text">Stalled gears from LATimes, Sept. 7, 2010 p. A11</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.mathblog.ellerman.org/2010/02/arbitrage-and-graphical-gridlock/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>From Partition Logic to Information Theory</title>
		<link>http://www.mathblog.ellerman.org/2010/02/from-partition-logic-to-information-theory/</link>
		<comments>http://www.mathblog.ellerman.org/2010/02/from-partition-logic-to-information-theory/#comments</comments>
		<pubDate>Sat, 06 Feb 2010 21:59:18 +0000</pubDate>
		<dc:creator>David Ellerman</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Information theory]]></category>
		<category><![CDATA[Partition logic]]></category>
		<category><![CDATA[Philosophy]]></category>
		<category><![CDATA[logical entropy]]></category>
		<category><![CDATA[probability theory]]></category>
		<category><![CDATA[Shannon]]></category>
		<category><![CDATA[Shannon entropy]]></category>

		<guid isPermaLink="false">http://www.mathblog.ellerman.org/?p=26</guid>
		<description><![CDATA[A new logic of partitions has been developed that is dual to ordinary logic when the latter is interpreted as the logic of subsets rather than the logic of propositions. For a finite universe, the logic of subsets gave rise to finite probability theory by assigning to each subset its relative cardinality as a Laplacian probability. The analogous development for the dual logic of partitions gives rise to a notion of logical entropy that is related in a precise manner to Claude Shannon's entropy.]]></description>
			<content:encoded><![CDATA[<h3>Some basic analogies between subset logic and partition logic</h3>
<p>A new logic of partitions has been developed that is dual to ordinary logic when the latter is interpreted as the logic of subsets rather than the logic of propositions. For a finite universe, the logic of subsets gave rise to finite probability theory by assigning to each subset its relative cardinality as a Laplacian probability. The analogous development for the dual logic of partitions gives rise to a notion of logical entropy that is related in a precise manner to <a href="http://en.wikipedia.org/wiki/Shannon_entropy">Claude Shannon&#8217;s entropy</a>. In this manner, the new logic of partitions provides a logical-conceptual foundation for information-theoretic entropy or information content. This post continues the earlier <a href="../../../../../2010/02/the-implication-operation-on-partitions-2/">post</a> which introduced some of the basic ideas and operations of partition logic.</p>
<p><span id="more-26"></span></p>
<p>A <em>partition</em> <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%3D%20%5C%7BB%2CB%27%2C%5Cldots%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi = \{B,B&#039;,\ldots\}' title='\pi = \{B,B&#039;,\ldots\}' class='latex' /> on a universe set U (two or more elements to avoid degeneracy) is a set of non-empty subsets (&#8220;blocks&#8221;) <img src='http://s.wordpress.com/latex.php?latex=B%2CB%27%2C%5Cldots%20%5Csubseteq%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B,B&#039;,\ldots \subseteq U' title='B,B&#039;,\ldots \subseteq U' class='latex' /> that are disjoint and jointly exhaust U. A partition may equivalently be viewed as an equivalence relation where the equivalence classes are the blocks. A partition <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%3D%20%5C%7BC%2CC%27%2C%5Cldots%20%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma = \{C,C&#039;,\ldots \}' title='\sigma = \{C,C&#039;,\ldots \}' class='latex' /> is <em>refined</em> by a partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%3D%20%5C%7BB%2CB%27%2C%5Cldots%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi = \{B,B&#039;,\ldots\}' title='\pi = \{B,B&#039;,\ldots\}' class='latex' /> if for every block <img src='http://s.wordpress.com/latex.php?latex=B%20%5Cin%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B \in \pi' title='B \in \pi' class='latex' />, there is a block <img src='http://s.wordpress.com/latex.php?latex=C%20%5Cin%20%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='C \in \sigma' title='C \in \sigma' class='latex' /> such that <img src='http://s.wordpress.com/latex.php?latex=B%20%5Csubseteq%20C&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B \subseteq C' title='B \subseteq C' class='latex' />. 
The partitions on U are partially ordered by refinement with the minimum partition or bottom being the <em>indiscrete partition</em> <img src='http://s.wordpress.com/latex.php?latex=0_%7BU%7D%3D%5C%7BU%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='0_{U}=\{U\}' title='0_{U}=\{U\}' class='latex' />, nicknamed the &#8220;blob,&#8221; consisting of U as a single block, and the maximum partition or top being the <em>discrete partition</em> <img src='http://s.wordpress.com/latex.php?latex=1_%7BU%7D%3D%5C%7B%5C%7Bu%5C%7D%2C%5C%7Bu%27%5C%7D%2C%5Cldots%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1_{U}=\{\{u\},\{u&#039;\},\ldots\}' title='1_{U}=\{\{u\},\{u&#039;\},\ldots\}' class='latex' /> where each block is a singleton. Join and meet operations are <a href="../../../../../2010/02/the-implication-operation-on-partitions-2/">easily defined</a> for this partial order so that the partitions on U form a (non-distributive) lattice (NB: in much of the older literature, the &#8220;lattice of partitions&#8221; is written &#8220;upside down&#8221; as the opposite lattice). Then the lattice operations of join and meet can be enriched by other partition operations such as negation, implication, and the (Sheffer) stroke or nand to form a partition algebra.</p>
<p>In the duality between subsets and partitions, outlined in an earlier <a href="../../../../../2010/01/from-propositional-logic-to-subset-logic-to-partition-logic/">post</a>, the dual of an &#8220;element of a subset&#8221; is a &#8220;distinction of a partition&#8221; where an ordered pair <img src='http://s.wordpress.com/latex.php?latex=%28u%2Cu%27%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(u,u&#039;)' title='(u,u&#039;)' class='latex' /> is a <em>distinction</em> or <em>dit</em> of a partition  <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%3D%20%5C%7BB%2CB%27%2C%5Cldots%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi = \{B,B&#039;,\ldots\}' title='\pi = \{B,B&#039;,\ldots\}' class='latex' /> if u and u&#8217; are in distinct blocks. In the  algebra of all partitions on U, the bottom partition <img src='http://s.wordpress.com/latex.php?latex=0_%7BU%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='0_{U}' title='0_{U}' class='latex' /> has no dits and the top <img src='http://s.wordpress.com/latex.php?latex=1_%7BU%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1_{U}' title='1_{U}' class='latex' /> has all dits [i.e., all pairs <img src='http://s.wordpress.com/latex.php?latex=%28u%2Cu%27%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(u,u&#039;)' title='(u,u&#039;)' class='latex' /> where <img src='http://s.wordpress.com/latex.php?latex=u%20%5Cnot%3D%20u%27&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='u \not= u&#039;' title='u \not= u&#039;' class='latex' />] just as in the analogous powerset Boolean algebra on U, the bottom <img src='http://s.wordpress.com/latex.php?latex=%5Cemptyset&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\emptyset' title='\emptyset' class='latex' /> has no elements and the top U has all the elements. 
Let <img src='http://s.wordpress.com/latex.php?latex=dit%28%5Cpi%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='dit(\pi)' title='dit(\pi)' class='latex' /> be the set of distinctions of <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' />. The partial order in the BA of subsets is just inclusion of elements and the refinement ordering of partitions is just the inclusion of distinctions, i.e., <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%5Cpreceq%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma \preceq \pi' title='\sigma \preceq \pi' class='latex' /> iff <img src='http://s.wordpress.com/latex.php?latex=dit%28%5Csigma%29%20%5Csubseteq%20dit%28%5Cpi%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='dit(\sigma) \subseteq dit(\pi)' title='dit(\sigma) \subseteq dit(\pi)' class='latex' />.</p>
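<p>For a small universe these definitions can be spelled out directly. The Python sketch below is an illustration only: representing a partition as a list of sets, and the names <code>dit_set</code> and <code>is_refined_by</code>, are assumptions and not notation from the post.</p>

```python
from itertools import product

def dit_set(partition):
    """All ordered pairs (u, v) whose members lie in different blocks."""
    block_of = {u: i for i, block in enumerate(partition) for u in block}
    return {(u, v) for u, v in product(block_of, repeat=2)
            if block_of[u] != block_of[v]}

def is_refined_by(sigma, pi):
    """True when every block of pi sits inside some block of sigma."""
    return all(any(set(b).issubset(c) for c in sigma) for b in pi)

blob = [{"a", "b", "c", "d"}]               # indiscrete partition
sigma = [{"a", "b"}, {"c", "d"}]
pi = [{"a"}, {"b"}, {"c", "d"}]
discrete = [{"a"}, {"b"}, {"c"}, {"d"}]     # discrete partition

# Refinement agrees with inclusion of dit sets in both directions.
print(is_refined_by(sigma, pi), dit_set(sigma).issubset(dit_set(pi)))
print(is_refined_by(pi, sigma), dit_set(pi).issubset(dit_set(sigma)))
```

<p>In this example the blob has no dits, the discrete partition has all 12 ordered pairs of distinct elements, and the dit sets of the two partitions in between have 8 and 10 pairs.</p>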
<table border="1" cellspacing="0" cellpadding="0" align="center">
<tbody>
<tr style="text-align: center;" bgcolor="#ffcc00">
<td valign="top" width="197">Table of analogies</td>
<td valign="top" width="197">Subset concept</td>
<td valign="top" width="197">Partition concept</td>
</tr>
<tr style="text-align: center;">
<td valign="top" width="197">&#8220;Elements&#8221;</td>
<td valign="top" width="197">Elements</td>
<td valign="top" width="197">Distinctions (dits)</td>
</tr>
<tr style="text-align: center;">
<td valign="top" width="197">All &#8220;elements&#8221;</td>
<td valign="top" width="197">Universe U</td>
<td valign="top" width="197">Discrete partition (all dits)</td>
</tr>
<tr style="text-align: center;">
<td valign="top" width="197">No &#8220;elements&#8221;</td>
<td valign="top" width="197">Null set <img src='http://s.wordpress.com/latex.php?latex=%5Cemptyset&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\emptyset' title='\emptyset' class='latex' /></td>
<td valign="top" width="197">Indiscrete partition (no dits)</td>
</tr>
<tr style="text-align: center;">
<td valign="top" width="197">Object or &#8220;event&#8221;</td>
<td valign="top" width="197">Subset <img src='http://s.wordpress.com/latex.php?latex=S%5Csubseteq%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='S\subseteq U' title='S\subseteq U' class='latex' /></td>
<td valign="top" width="197">Partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /> on U</td>
</tr>
<tr style="text-align: center;">
<td valign="top" width="197">&#8220;Event&#8221; occurs</td>
<td valign="top" width="197">Element <img src='http://s.wordpress.com/latex.php?latex=u%20%5Cin%20S&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='u \in S' title='u \in S' class='latex' /></td>
<td valign="top" width="197">Dit <img src='http://s.wordpress.com/latex.php?latex=%28u%2Cu%27%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(u,u&#039;)' title='(u,u&#039;)' class='latex' /> distinguished by <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /></td>
</tr>
<tr style="text-align: center;">
<td valign="top" width="197">Partial order</td>
<td valign="top" width="197">Inclusion of elements</td>
<td valign="top" width="197">Inclusion of dits</td>
</tr>
<tr style="text-align: center;">
<td valign="top" width="197">Lattice of &#8220;events&#8221;</td>
<td valign="top" width="197">Lattice of all subsets <img src='http://s.wordpress.com/latex.php?latex=%5Cmathcal%7BP%7D%28U%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathcal{P}(U)' title='\mathcal{P}(U)' class='latex' /></td>
<td valign="top" width="197">Lattice of all partitions <img src='http://s.wordpress.com/latex.php?latex=%5CPi%28U%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\Pi(U)' title='\Pi(U)' class='latex' /></td>
</tr>
</tbody>
</table>
<h3>Mimicking the development from subset logic to probability theory</h3>
<p>With these analogies in hand, we then mimic the development of finite probability theory from subset logic (which goes back to Boole) using the corresponding concepts from partition logic.</p>
<p>For a finite U, the finite (Laplacian) <em>probability</em> <img src='http://s.wordpress.com/latex.php?latex=p%28S%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p(S)' title='p(S)' class='latex' /> of a subset (&#8220;event&#8221;) is the ratio: <img src='http://s.wordpress.com/latex.php?latex=p%28S%29%20%3D%20%7CS%7C%2F%7CU%7C&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p(S) = |S|/|U|' title='p(S) = |S|/|U|' class='latex' />. Analogously, the finite<em> logical entropy</em> (or <em>logical information content</em>) <img src='http://s.wordpress.com/latex.php?latex=h%28%5Cpi%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(\pi)' title='h(\pi)' class='latex' /> of a partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /> is the relative size of its dit set compared to the whole &#8220;closure space&#8221; <img src='http://s.wordpress.com/latex.php?latex=U%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U\times U' title='U\times U' class='latex' />:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=h%28%5Cpi%29%20%3D%20%5Cfrac%7B%7Cdit%28%5Cpi%29%7C%7D%7B%7CU%5Ctimes%20U%7C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(\pi) = \frac{|dit(\pi)|}{|U\times U|}' title='h(\pi) = \frac{|dit(\pi)|}{|U\times U|}' class='latex' />.</p>
<p>If U is an urn with each &#8220;ball&#8221; in the urn being equiprobable, then <img src='http://s.wordpress.com/latex.php?latex=p%28S%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p(S)' title='p(S)' class='latex' /> is the probability of an element randomly drawn from the urn being in S, and <img src='http://s.wordpress.com/latex.php?latex=h%28%5Cpi%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(\pi)' title='h(\pi)' class='latex' /> is the probability that a pair of elements randomly drawn from the urn (with replacement) is a distinction of <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' />.</p>
<p>Let <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%3D%20%5C%7BB_%7B1%7D%2C%5Cldots%2CB_%7Bn%7D%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi = \{B_{1},\ldots,B_{n}\}' title='\pi = \{B_{1},\ldots,B_{n}\}' class='latex' /> with <img src='http://s.wordpress.com/latex.php?latex=p_%7Bi%7D%20%3D%20%7CB_%7Bi%7D%7C%2F%7CU%7C&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{i} = |B_{i}|/|U|' title='p_{i} = |B_{i}|/|U|' class='latex' /> being the probability of drawing an element of <img src='http://s.wordpress.com/latex.php?latex=B_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B_{i}' title='B_{i}' class='latex' />. The number of indistinctions (non-distinctions) of <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /> is <img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Bi%7D%7CB_%7Bi%7D%7C%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum_{i}|B_{i}|^{2}' title='\sum_{i}|B_{i}|^{2}' class='latex' /> so the number of distinctions is <img src='http://s.wordpress.com/latex.php?latex=%7Cdit%28%5Cpi%29%7C%3D%7CU%7C%5E%7B2%7D-%5Csum_%7Bi%7D%7CB_%7Bi%7D%7C%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|dit(\pi)|=|U|^{2}-\sum_{i}|B_{i}|^{2}' title='|dit(\pi)|=|U|^{2}-\sum_{i}|B_{i}|^{2}' class='latex' /> and thus the logical entropy of <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /> is:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=h%28%5Cpi%29%20%3D%20%5Cfrac%7B%7CU%7C%5E%7B2%7D-%5Csum_%7Bi%7D%7CB_%7Bi%7D%7C%5E%7B2%7D%7D%7B%7CU%7C%5E%7B2%7D%7D%3D1-%5Csum_%7Bi%7Dp_%7Bi%7D%5E%7B2%7D%3D%5Csum_%7Bi%7Dp_%7Bi%7D%281-p_%7Bi%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(\pi) = \frac{|U|^{2}-\sum_{i}|B_{i}|^{2}}{|U|^{2}}=1-\sum_{i}p_{i}^{2}=\sum_{i}p_{i}(1-p_{i})' title='h(\pi) = \frac{|U|^{2}-\sum_{i}|B_{i}|^{2}}{|U|^{2}}=1-\sum_{i}p_{i}^{2}=\sum_{i}p_{i}(1-p_{i})' class='latex' /></p>
<p>since <img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Bi%7Dp_%7Bi%7D%3D1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum_{i}p_{i}=1' title='\sum_{i}p_{i}=1' class='latex' />.</p>
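<p>The two expressions for the logical entropy can be compared on a small case. This Python sketch assumes, for illustration only, that a partition is given as a list of sets; both function names are hypothetical. One function counts ordered pairs by brute force and the other uses the block probabilities.</p>

```python
from fractions import Fraction
from itertools import product

def logical_entropy_by_counting(partition):
    """|dit(pi)| / |U x U| by brute-force enumeration of ordered pairs."""
    block_of = {u: i for i, block in enumerate(partition) for u in block}
    dits = sum(1 for u, v in product(block_of, repeat=2)
               if block_of[u] != block_of[v])
    return Fraction(dits, len(block_of) ** 2)

def logical_entropy_by_formula(partition):
    """1 minus the sum of the squared p_i, with p_i = |B_i| / |U|."""
    size = sum(len(block) for block in partition)
    return 1 - sum(Fraction(len(block), size) ** 2 for block in partition)

pi = [{1, 2, 3}, {4, 5}, {6}]
print(logical_entropy_by_counting(pi), logical_entropy_by_formula(pi))
```

<p>Here |U| = 6 and the block sizes are 3, 2 and 1, so there are 9 + 4 + 1 = 14 indistinctions, 36 – 14 = 22 distinctions, and both functions give 22/36 = 11/18.</p>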
<p>The table of analogies can be continued.</p>
<table border="1" cellspacing="0" cellpadding="0" align="center">
<tbody>
<tr style="text-align: center;" bgcolor="#ffcc00">
<td valign="top" width="197">Table of analogies</td>
<td valign="top" width="197">Subset concept</td>
<td valign="top" width="197">Partition concept</td>
</tr>
<tr style="text-align: center;">
<td valign="top" width="197">Counting measure (U finite)</td>
<td valign="top" width="197"># elements in subset S</td>
<td valign="top" width="197"># dits in partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /></td>
</tr>
<tr style="text-align: center;">
<td valign="top" width="197">Normalized count</td>
<td valign="top" width="197"><img src='http://s.wordpress.com/latex.php?latex=p%28S%29%3D%5Cfrac%7B%7CS%7C%7D%7B%7CU%7C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p(S)=\frac{|S|}{|U|}' title='p(S)=\frac{|S|}{|U|}' class='latex' /></td>
<td valign="top" width="197"><img src='http://s.wordpress.com/latex.php?latex=h%28%5Cpi%29%3D%5Cfrac%7B%7Cdit%28%5Cpi%29%7C%7D%7B%7CU%5Ctimes%20U%7C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(\pi)=\frac{|dit(\pi)|}{|U\times U|}' title='h(\pi)=\frac{|dit(\pi)|}{|U\times U|}' class='latex' /></td>
</tr>
<tr style="text-align: center;">
<td valign="top" width="197">Probability interpretation</td>
<td valign="top" width="197"><img src='http://s.wordpress.com/latex.php?latex=p%28S%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p(S)' title='p(S)' class='latex' /> = prob. random element in S</td>
<td valign="top" width="197"><img src='http://s.wordpress.com/latex.php?latex=h%28%5Cpi%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(\pi)' title='h(\pi)' class='latex' /> = prob. random pair distinguished by <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' />.</td>
</tr>
</tbody>
</table>
<p>In Shannon&#8217;s information theory, the <em>entropy</em> <img src='http://s.wordpress.com/latex.php?latex=H%28%5Cpi%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='H(\pi)' title='H(\pi)' class='latex' /> of the partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' />  (with the same probabilities assigned to the blocks) is:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=H%28%5Cpi%29%3D%5Csum_%7Bi%7Dp_%7Bi%7Dlog%281%2Fp_%7Bi%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='H(\pi)=\sum_{i}p_{i}log(1/p_{i})' title='H(\pi)=\sum_{i}p_{i}log(1/p_{i})' class='latex' /></p>
<p>where the log is base 2.</p>
<p>Each entropy can be seen as the probabilistic average of the &#8220;block entropies&#8221; where the logical block entropy is <img src='http://s.wordpress.com/latex.php?latex=h%28B_%7Bi%7D%29%20%3D%201-p_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(B_{i}) = 1-p_{i}' title='h(B_{i}) = 1-p_{i}' class='latex' /> and the Shannon block entropy is <img src='http://s.wordpress.com/latex.php?latex=H%28B_%7Bi%7D%29%20%3D%20log%281%2Fp_%7Bi%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='H(B_{i}) = log(1/p_{i})' title='H(B_{i}) = log(1/p_{i})' class='latex' />. To interpret the block entropies, consider a special case where <img src='http://s.wordpress.com/latex.php?latex=p_%7Bi%7D%20%3D%201%2F2%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{i} = 1/2^{n}' title='p_{i} = 1/2^{n}' class='latex' /> and every block is the same so there are <img src='http://s.wordpress.com/latex.php?latex=2%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='2^{n}' title='2^{n}' class='latex' /> equal blocks like <img src='http://s.wordpress.com/latex.php?latex=B_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B_{i}' title='B_{i}' class='latex' /> in the partition, e.g., the discrete partition on a set with <img src='http://s.wordpress.com/latex.php?latex=2%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='2^{n}' title='2^{n}' class='latex' /> elements. The logical entropy of that special equal-block partition is the logical block entropy:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Bi%7Dp_%7Bi%7D%281-p_%7Bi%7D%29%3D%282%5E%7Bn%7D%29p_%7Bi%7D%281-p_%7Bi%7D%29%3D%282%5E%7Bn%7D%29%281%2F2%5E%7Bn%7D%29%281-p_%7Bi%7D%29%3Dh%28B_%7Bi%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum_{i}p_{i}(1-p_{i})=(2^{n})p_{i}(1-p_{i})=(2^{n})(1/2^{n})(1-p_{i})=h(B_{i})' title='\sum_{i}p_{i}(1-p_{i})=(2^{n})p_{i}(1-p_{i})=(2^{n})(1/2^{n})(1-p_{i})=h(B_{i})' class='latex' />.</p>
<p>Instead of directly counting the distinctions, we could take the number of binary equal-blocked partitions it takes to distinguish all the <img src='http://s.wordpress.com/latex.php?latex=2%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='2^{n}' title='2^{n}' class='latex' /> blocks. As in the game of &#8220;twenty questions,&#8221; if there is a search for an unknown designated block, then each such binary question halves the number of candidate blocks, so the minimum number of binary partitions it takes to distinguish all the <img src='http://s.wordpress.com/latex.php?latex=2%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='2^{n}' title='2^{n}' class='latex' /> blocks is:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=H%28B_%7Bi%7D%29%3Dlog%281%2Fp_%7Bi%7D%29%3Dlog%282%5E%7Bn%7D%29%3Dn&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='H(B_{i})=log(1/p_{i})=log(2^{n})=n' title='H(B_{i})=log(1/p_{i})=log(2^{n})=n' class='latex' />.</p>
<p>To precisely relate the block entropies, we solve each for <img src='http://s.wordpress.com/latex.php?latex=p_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p_{i}' title='p_{i}' class='latex' />, which is then eliminated to obtain:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=h%28B%29%3D1-%5Cfrac%7B1%7D%7B2%5E%7BH%28B%29%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(B)=1-\frac{1}{2^{H(B)}}' title='h(B)=1-\frac{1}{2^{H(B)}}' class='latex' />.</p>
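<p>The relation between the two block entropies can be checked numerically in the equal-block case. The following is only a minimal Python sketch; the function names <code>logical_entropy</code> and <code>shannon_entropy</code> are illustrative choices, and exact fractions are used for the probabilities so that the comparison is not affected by rounding.</p>

```python
from fractions import Fraction
from math import log2

def logical_entropy(probs):
    # h(pi) = sum_i p_i * (1 - p_i)
    return sum(p * (1 - p) for p in probs)

def shannon_entropy(probs):
    # H(pi) = sum_i p_i * log2(1 / p_i)
    return sum(float(p) * log2(float(1 / p)) for p in probs)

for n in range(1, 6):
    # 2^n equal blocks, each with probability 1/2^n
    probs = [Fraction(1, 2**n)] * 2**n
    h = logical_entropy(probs)
    H = shannon_entropy(probs)
    # In this special case h = 1 - 1/2^H, with H = n
    assert h == 1 - Fraction(1, 2**round(H))
    print(n, H, h)
```

<p>The check is restricted to equal blocks because that is the only case in which the partition entropies coincide with the block entropies.</p>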
<h3>An Example</h3>
<p>Consider an example of a set <img src='http://s.wordpress.com/latex.php?latex=U%3D%5C%7B0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U=\{0,1,2,3,4,5,6,7\}' title='U=\{0,1,2,3,4,5,6,7\}' class='latex' />  with <img src='http://s.wordpress.com/latex.php?latex=2%5E%7B3%7D%3D8&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='2^{3}=8' title='2^{3}=8' class='latex' /> elements so that the Shannon entropy of 3 is the least number of binary partitions it takes to distinguish all the elements of the set. The effects of the three partitions can be illustrated in the following squares.</p>
<p style="text-align: center;"><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/8x8-grid-3-in-1.jpg"><img class="size-full wp-image-27  aligncenter" title="8x8-grid-3-in-1" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/8x8-grid-3-in-1.jpg" alt="" width="381" height="111" /></a></p>
<p>In terms of the 20 questions game, one could think of the first binary question as asking for the leftmost digit in the binary representation of each number. That gives the first partition, <img src='http://s.wordpress.com/latex.php?latex=%5Cpi_%7B1%7D%3D%5C%7B%5C%7B0%2C1%2C2%2C3%5C%7D%2C%5C%7B4%2C5%2C6%2C7%5C%7D%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi_{1}=\{\{0,1,2,3\},\{4,5,6,7\}\}' title='\pi_{1}=\{\{0,1,2,3\},\{4,5,6,7\}\}' class='latex' /> which is represented by the leftmost square. There are 64 squares and the indistinctions or indits of the equipartition of the 8 element set are represented by the 32 shaded squares, while the distinctions or dits of the partition are given by the 32 unshaded squares.</p>
<p>The second binary question asks for the next digit in the binary representation of the number. This yields the second partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi_%7B2%7D%3D%5C%7B%5C%7B0%2C1%2C4%2C5%5C%7D%2C%5C%7B2%2C3%2C6%2C7%5C%7D%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi_{2}=\{\{0,1,4,5\},\{2,3,6,7\}\}' title='\pi_{2}=\{\{0,1,4,5\},\{2,3,6,7\}\}' class='latex' /> so &#8220;joining&#8221; the information in the two partitions together gives the join <img src='http://s.wordpress.com/latex.php?latex=%5Cpi_%7B1%7D%5Clor%5Cpi_%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi_{1}\lor\pi_{2}' title='\pi_{1}\lor\pi_{2}' class='latex' /> represented in the middle square. Then 16 more squares became unshaded, i.e., 16 additional pairs were distinguished by <img src='http://s.wordpress.com/latex.php?latex=%5Cpi_%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi_{2}' title='\pi_{2}' class='latex' />, for a total of 32 + 16 = 48 dits.</p>
<p>The third binary question asks for the rightmost digit which yields the third partition, <img src='http://s.wordpress.com/latex.php?latex=%5Cpi_%7B3%7D%3D%5C%7B%5C%7B0%2C2%2C4%2C6%5C%7D%2C%5C%7B1%2C3%2C5%2C7%5C%7D%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi_{3}=\{\{0,2,4,6\},\{1,3,5,7\}\}' title='\pi_{3}=\{\{0,2,4,6\},\{1,3,5,7\}\}' class='latex' />, which is joined to the other two to create the discrete partition <img src='http://s.wordpress.com/latex.php?latex=1_%7BU%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1_{U}' title='1_{U}' class='latex' /> which distinguishes all elements and is represented by the shaded squares on the diagonal in the rightmost square. This partition adds 8 more unshaded squares, i.e., distinguishes 8 more pairs, for a total of 48 + 8 = 56 dits distinguished by the 3 partitions.</p>
<p>Shannon entropy counts the least number of these partitions it takes to distinguish all the elements, <img src='http://s.wordpress.com/latex.php?latex=H%28%5Cpi_%7B1%7D%5Clor%20%5Cpi_%7B2%7D%20%5Clor%20%5Cpi_%7B3%7D%29%3DH%281_%7BU%7D%29%3D3&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='H(\pi_{1}\lor \pi_{2} \lor \pi_{3})=H(1_{U})=3' title='H(\pi_{1}\lor \pi_{2} \lor \pi_{3})=H(1_{U})=3' class='latex' />, while the logical entropy counts the number of distinctions which are thereby created, i.e., 56, which normalizes to: <img src='http://s.wordpress.com/latex.php?latex=h%28%5Cpi_%7B1%7D%5Clor%20%5Cpi_%7B2%7D%20%5Clor%20%5Cpi_%7B3%7D%29%3Dh%281_%7BU%7D%29%3D56%2F64%3D7%2F8&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(\pi_{1}\lor \pi_{2} \lor \pi_{3})=h(1_{U})=56/64=7/8' title='h(\pi_{1}\lor \pi_{2} \lor \pi_{3})=h(1_{U})=56/64=7/8' class='latex' />. In this <img src='http://s.wordpress.com/latex.php?latex=2%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='2^{n}' title='2^{n}' class='latex' /> example, the two entropies stand in the relationship of the block entropies:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=h%281_%7BU%7D%29%3D%5Cfrac%7B7%7D%7B8%7D%3D1-%5Cfrac%7B1%7D%7B2%5E%7B3%7D%7D%3D1-%5Cfrac%7B1%7D%7B2%5E%7BH%281_%7BU%7D%29%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(1_{U})=\frac{7}{8}=1-\frac{1}{2^{3}}=1-\frac{1}{2^{H(1_{U})}}' title='h(1_{U})=\frac{7}{8}=1-\frac{1}{2^{3}}=1-\frac{1}{2^{H(1_{U})}}' class='latex' />.</p>
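<p>The dit counts in this example can be reproduced by brute force. The sketch below is one way to do it in Python; the helper names <code>dits</code> and <code>join</code> are assumptions made for illustration. It treats a dit as an ordered pair of elements lying in different blocks, which matches the 64-square picture.</p>

```python
from itertools import product

U = range(8)

def dits(partition):
    # Ordered pairs (u, v) whose elements lie in different blocks.
    block_of = {u: i for i, block in enumerate(partition) for u in block}
    return {(u, v) for u, v in product(U, U) if block_of[u] != block_of[v]}

def join(p, q):
    # Blocks of the join are the non-empty intersections of blocks.
    return [b.intersection(c) for b in p for c in q if b.intersection(c)]

pi1 = [{0, 1, 2, 3}, {4, 5, 6, 7}]
pi2 = [{0, 1, 4, 5}, {2, 3, 6, 7}]
pi3 = [{0, 2, 4, 6}, {1, 3, 5, 7}]

j12 = join(pi1, pi2)
j123 = join(j12, pi3)
counts = [len(dits(pi1)), len(dits(j12)), len(dits(j123))]
print(counts)  # [32, 48, 56]
```

<p>Dividing the last count by 64 gives the logical entropy 7/8 of the discrete partition, while the number of partitions joined, 3, is its Shannon entropy.</p>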
<p>The interpretation of the Shannon block entropy is then extended by analogy to the general case where <img src='http://s.wordpress.com/latex.php?latex=1%2Fp_%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1/p_{i}' title='1/p_{i}' class='latex' /> is not a power of 2 so that the Shannon entropy <img src='http://s.wordpress.com/latex.php?latex=H%28%5Cpi%29%3D%5Csum_%7Bi%7Dp_%7Bi%7DH%28B_%7Bi%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='H(\pi)=\sum_{i}p_{i}H(B_{i})' title='H(\pi)=\sum_{i}p_{i}H(B_{i})' class='latex' /> is then interpreted as the average number of binary partitions needed to make all the distinctions between the blocks of <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' />—whereas the logical entropy is still the relativized count <img src='http://s.wordpress.com/latex.php?latex=h%28%5Cpi%29%3D%5Csum_%7Bi%7Dp_%7Bi%7Dh%28B_%7Bi%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(\pi)=\sum_{i}p_{i}h(B_{i})' title='h(\pi)=\sum_{i}p_{i}h(B_{i})' class='latex' /> of the distinctions created by the partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' />.</p>
<h3>Concluding remarks</h3>
<p>Hence the two notions of entropy boil down to two different ways to count the distinctions of a partition. And thus the concept of a distinction from partition logic provides a logical-conceptual basis for the notion of entropy or information content in information theory. Many of the concepts and relations of Shannon&#8217;s information theory, e.g., <a href="http://en.wikipedia.org/wiki/Mutual_information">mutual information</a>, <a href="http://en.wikipedia.org/wiki/Cross_entropy">cross entropy</a>, <a href="http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence">divergence</a>, and the <a href="http://en.wikipedia.org/wiki/Gibbs%27_inequality">information inequality</a>, can then be developed at the logical level in logical information theory. This and much else is spelled out in: Counting Distinctions: On the Conceptual Foundations of Shannon&#8217;s Information Theory. <em>Synthese.</em> 168 (1, May 2009): 119-149, which can be downloaded <a href="http://www.ellerman.org/Davids-Stuff/Maths/Counting-Dits-reprint.pdf">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathblog.ellerman.org/2010/02/from-partition-logic-to-information-theory/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Series-Parallel Duality: Part I: Combating series chauvinism</title>
		<link>http://www.mathblog.ellerman.org/2010/02/series-parallel-duality-part-i-combating-series-chauvinism/</link>
		<comments>http://www.mathblog.ellerman.org/2010/02/series-parallel-duality-part-i-combating-series-chauvinism/#comments</comments>
		<pubDate>Sat, 06 Feb 2010 01:50:00 +0000</pubDate>
		<dc:creator>David Ellerman</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Math economics]]></category>
		<category><![CDATA[geometric series]]></category>
		<category><![CDATA[MacMahon]]></category>
		<category><![CDATA[reciprocity]]></category>
		<category><![CDATA[series chauvinism]]></category>
		<category><![CDATA[series-parallel duality]]></category>

		<guid isPermaLink="false">http://www.mathblog.ellerman.org/?p=19</guid>
		<description><![CDATA[This post describes the duality between the usual (series) addition and the dual parallel addition. This duality is normally considered in electrical circuit theory and combinatorics, but it has much wider applications. In Part I of this post, the focus is on developing series-parallel dual formulas, in contrast to the usual focus on formulas using only the series sum.]]></description>
			<content:encoded><![CDATA[<h3>Series chauvinism</h3>
<p>This post describes the duality between the usual (series) addition and the dual parallel addition. This duality is normally considered in electrical circuit theory and combinatorics, but it has much wider applications. In Part I of this post, the focus is on developing series-parallel dual formulas, in contrast to the usual focus on formulas using only the series sum.</p>
<p>From the viewpoint of pure mathematics, the parallel sum is &#8220;just as good&#8221; as the series sum.  It is only for empirical and perhaps even some accidental reasons that so much mathematics is developed using the series sum instead of the equally good parallel sum.  There is a whole &#8220;parallel mathematics&#8221; which can be developed with the parallel sum replacing the series sum.  Since the parallel sum can be defined in terms of the series sum (or vice-versa), &#8220;parallel mathematics&#8221; is essentially a new way of looking at certain known parts of mathematics.</p>
<p>Exclusive promotion of the series sum and prejudice against the parallel sum is <em>series chauvinism</em>.  Before venturing further into the parallel universe, we might suggest some exercises to help the politically incorrect reader to combat the heritage of series chauvinism.  Anytime the series sum seems to occur naturally in mathematics with the parallel sum being seemingly invisible, it is an illusion due to series chauvinism.  The parallel sum has a &#8220;parallel&#8221; role that has been unfairly neglected.</p>
<p>In Part II of the post, series-parallel duality is applied to financial arithmetic and is shown to underlie a certain duality of formulas that has long been noted in the literature on valuation and appraisal. Hence this instance of intellectual trespassing applies concepts best known from electrical circuit theory to the operations of financial arithmetic. Moreover in economic theory, the much-used duality of convex functions and their conjugates is the &#8220;integral&#8221; of series-parallel (SP) duality, or, to put it the other way around, SP duality is the &#8220;derivative&#8221; of convex duality.</p>
<p><span id="more-19"></span></p>
<h3>Parallel addition</h3>
<p>When resistors with resistances a and b are placed in series, their compound resistance is the usual sum (hereafter the <em>series sum</em>) of the resistances a+b.  If the resistances are placed in parallel, their compound resistance is the <em>parallel sum</em> of the resistances, which is denoted by the full colon:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=a%3Ab%3D%28a%5E%7B-1%7D%2Bb%5E%7B-1%7D%29%5E%7B-1%7D%3D%5Cfrac%7B1%7D%7B%5Cfrac%7B1%7D%7Ba%7D%2B%5Cfrac%7B1%7D%7Bb%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='a:b=(a^{-1}+b^{-1})^{-1}=\frac{1}{\frac{1}{a}+\frac{1}{b}}' title='a:b=(a^{-1}+b^{-1})^{-1}=\frac{1}{\frac{1}{a}+\frac{1}{b}}' class='latex' /></p>
<div id="attachment_20" class="wp-caption aligncenter" style="width: 238px"><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/SP-sums.jpg"><img class="size-full wp-image-20" title="SP-sums" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/SP-sums.jpg" alt="" width="228" height="75" /></a><p class="wp-caption-text">Series and Parallel Sums</p></div>
<p>The parallel sum is associative x:(y:z) = (x:y):z, commutative x:y = y:x, and distributive x(y:z) = xy:xz.  On the positive reals, there is no identity element for either sum but the &#8220;short circuit&#8221; 0 and the &#8220;open circuit&#8221; <img src='http://s.wordpress.com/latex.php?latex=%5Cinfty&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\infty' title='\infty' class='latex' /> can be added to form the extended positive reals.  Those elements are the identity elements for the two sums, <img src='http://s.wordpress.com/latex.php?latex=x%2B0%20%3D%20x%20%3D%20x%3A%5Cinfty&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='x+0 = x = x:\infty' title='x+0 = x = x:\infty' class='latex' />. That is, adding a short circuit in series to a resistor does not change the resistance, and adding an open circuit in parallel to a resistor does not change the resistance.</p>
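<p>These algebraic properties are easy to spot-check with exact rational arithmetic. The following Python sketch assumes a helper named <code>par</code> (a name chosen here for illustration) that takes the reciprocal of the series sum of the reciprocals; the extended elements 0 and infinity are left out, since they have no reciprocal as fractions.</p>

```python
from fractions import Fraction

def par(*xs):
    # Parallel sum: reciprocal of the (series) sum of the reciprocals.
    return 1 / sum(1 / Fraction(x) for x in xs)

x, y, z = Fraction(2), Fraction(3), Fraction(6)
assert par(x, par(y, z)) == par(par(x, y), z)   # associative
assert par(x, y) == par(y, x)                   # commutative
assert x * par(y, z) == par(x * y, x * z)       # distributive

print(par(2, 3))  # 6/5
```

<p>Two resistors of 2 and 3 ohms in parallel thus have a compound resistance of 6/5 ohms, smaller than either one.</p>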
<p>As an example of series chauvinism in elementary math, the series sum of fractions is expressed by the annoyingly asymmetrical rule: &#8220;Find the common denominator and then add the numerators.&#8221;  The parallel sum of fractions restores symmetry since it is defined in the dual fashion: &#8220;Find the common numerator and then (series) add the denominators.&#8221;</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7Ba%7D%7Bb%7D%3A%5Cfrac%7Ba%7D%7Bd%7D%3D%5Cfrac%7Ba%7D%7Bb%2Bd%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{a}{b}:\frac{a}{d}=\frac{a}{b+d}' title='\frac{a}{b}:\frac{a}{d}=\frac{a}{b+d}' class='latex' /></p>
<p>The usual series sum of fractions can also be obtained by finding the common <em>numerator</em> and then taking the <em>parallel</em> sum of the denominators.</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7Ba%7D%7Bb%7D%2B%5Cfrac%7Ba%7D%7Bd%7D%3D%5Cfrac%7Ba%7D%7Bb%3Ad%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{a}{b}+\frac{a}{d}=\frac{a}{b:d}' title='\frac{a}{b}+\frac{a}{d}=\frac{a}{b:d}' class='latex' /></p>
<p>The parallel sum of fractions can also be obtained by finding the common denominator and taking the <em>parallel</em> sum of numerators.</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7Ba%7D%7Bb%7D%3A%5Cfrac%7Bc%7D%7Bb%7D%3D%5Cfrac%7Ba%3Ac%7D%7Bb%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{a}{b}:\frac{c}{b}=\frac{a:c}{b}' title='\frac{a}{b}:\frac{c}{b}=\frac{a:c}{b}' class='latex' /></p>
<p>The rules for series and parallel sums of fractions can be summarized in the following four equations which restore full symmetry.</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7Ba%7D%7B1%7D%2B%5Cfrac%7Bb%7D%7B1%7D%3D%5Cfrac%7Ba%2Bb%7D%7B1%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{a}{1}+\frac{b}{1}=\frac{a+b}{1}' title='\frac{a}{1}+\frac{b}{1}=\frac{a+b}{1}' class='latex' />    <img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7Ba%7D%7B1%7D%3A%5Cfrac%7Bb%7D%7B1%7D%3D%5Cfrac%7Ba%3Ab%7D%7B1%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{a}{1}:\frac{b}{1}=\frac{a:b}{1}' title='\frac{a}{1}:\frac{b}{1}=\frac{a:b}{1}' class='latex' /></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B1%7D%7Ba%7D%3A%5Cfrac%7B1%7D%7Bb%7D%3D%5Cfrac%7B1%7D%7Ba%2Bb%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{1}{a}:\frac{1}{b}=\frac{1}{a+b}' title='\frac{1}{a}:\frac{1}{b}=\frac{1}{a+b}' class='latex' />     <img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B1%7D%7Ba%7D%2B%5Cfrac%7B1%7D%7Bb%7D%3D%5Cfrac%7B1%7D%7Ba%3Ab%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{1}{a}+\frac{1}{b}=\frac{1}{a:b}' title='\frac{1}{a}+\frac{1}{b}=\frac{1}{a:b}' class='latex' /></p>
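<p>The fraction rules above can be verified on sample values. This is a hedged Python sketch, not part of the original derivation: <code>par</code> is an illustrative helper for the parallel sum, <code>F</code> is just an alias for the standard library's <code>Fraction</code>, and the numbers are arbitrary.</p>

```python
from fractions import Fraction as F

def par(*xs):
    # Parallel sum: reciprocal of the (series) sum of the reciprocals.
    return 1 / sum(1 / F(x) for x in xs)

a, b, c, d = 3, 4, 5, 7

# Common numerator: parallel sum adds denominators in series.
assert par(F(a, b), F(a, d)) == F(a, b + d)
# Common numerator: series sum takes the parallel sum of denominators.
assert F(a, b) + F(a, d) == a / par(b, d)
# Common denominator: parallel sum takes the parallel sum of numerators.
assert par(F(a, b), F(c, b)) == par(a, c) / b
# Common denominator: the familiar series rule.
assert F(a, b) + F(c, b) == F(a + c, b)
```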
<p>For another example, a series chauvinist might point out that the series sum appears naturally in the rule for working with exponents x<sup>a</sup>x<sup>b</sup> = x<sup>a+b</sup> while the parallel sum does not.  But this is only an illusion due to our mathematically arbitrary symmetry-breaking choice to take exponents to represent powers rather than roots.  Let a pre-superscript stand for a root (just as a post-superscript stands for a power) so <sup>2</sup>x would be the square root of x.  Then the rule for working with <em>these</em> exponents is <sup>a</sup>x<sup>b</sup>x = <sup>a:b</sup>x so the parallel sum does have a role symmetrical to the series sum in the rules for working with exponents.</p>
<h3>Series-Parallel Duality: The Reciprocity Map</h3>
<p>The duality between the series and parallel additions on the positive reals R<sup>+</sup> can be studied by considering the (bijective) reciprocity map</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Crho%20%3AR%5E%7B%2B%7D%5Crightarrow%20R%5E%7B%2B%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\rho :R^{+}\rightarrow R^{+}' title='\rho :R^{+}\rightarrow R^{+}' class='latex' /> given by <img src='http://s.wordpress.com/latex.php?latex=%5Crho%28x%29%3D1%2Fx&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\rho(x)=1/x' title='\rho(x)=1/x' class='latex' />.</p>
<p>The reciprocity map preserves the unit <img src='http://s.wordpress.com/latex.php?latex=%5Crho%281%29%3D1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\rho(1)=1' title='\rho(1)=1' class='latex' />, preserves multiplication <img src='http://s.wordpress.com/latex.php?latex=%5Crho%28xy%29%3D%5Crho%28x%29%5Crho%28y%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\rho(xy)=\rho(x)\rho(y)' title='\rho(xy)=\rho(x)\rho(y)' class='latex' />, and <em>interchanges</em> the two additions:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Crho%28x%2By%29%3D%5Crho%28x%29%3A%5Crho%28y%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\rho(x+y)=\rho(x):\rho(y)' title='\rho(x+y)=\rho(x):\rho(y)' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%5Crho%28x%3Ay%29%3D%5Crho%28x%29%2B%5Crho%28y%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\rho(x:y)=\rho(x)+\rho(y)' title='\rho(x:y)=\rho(x)+\rho(y)' class='latex' />.</p>
<p>The reciprocity map captures series-parallel duality on the positive reals.</p>
<p><a href="http://en.wikipedia.org/wiki/Percy_MacMahon">Percy MacMahon</a> called a series connection a &#8220;chain&#8221; and a parallel connection a &#8220;yoke&#8221; (as in ox yoke).  A <em>series-parallel network</em> is constructed solely from chains and yokes (series and parallel connections).  By interchanging the series and parallel connections, each series-parallel network yields a <em>dual</em> or <em>conjugate</em> series-parallel network.  To obtain the dual of an expression such as a+b, apply the reciprocity map to obtain (1/a) : (1/b) but then, for the atomic variables, replace 1/a by a and so forth in the final expression.  Hence the MacMahon dual to a+b  would be a:b, and the dual expression to a+ ((b+c) : d) would be a : ((b : c) + d) (see below).</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/MacMahon-duals.jpg"><img class="aligncenter size-full wp-image-21" title="MacMahon-duals" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/MacMahon-duals.jpg" alt="" width="399" height="185" /></a></p>
<p style="text-align: center;">Conjugate Series-Parallel Networks</p>
<p>If each variable a, b, &#8230; equals one, then the reciprocity map carries each expression for the compound resistance into the conjugate expression.  Hence if all the &#8220;atomic&#8221; resistances are one ohm, a = b = c = d = 1, and the compound resistance of a series-parallel network is R, then the compound resistance of the conjugate network is 1/R [MacMahon 1881, 1892; reprinted in: 1978].  With any positive reals as resistances, MacMahon&#8217;s chain-yoke reciprocity theorem continues to hold if each atomic resistance is also inverted in the conjugate network (i.e., if we just apply the reciprocity map).</p>
<p><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/MacMahon-thm.jpg"><img class="aligncenter size-full wp-image-22" title="MacMahon-thm" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/MacMahon-thm.jpg" alt="" width="330" height="88" /></a></p>
<p style="text-align: center;">MacMahon Chain-Yoke Reciprocity Theorem</p>
<p>The theorem amounts to the observation that the reciprocity map interchanges the two sums while preserving multiplication and unity.  The fundamental intuition is that the series-parallel dual gives reciprocals or multiplicative inverses.</p>
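<p>The reciprocity theorem can be illustrated on the example network a + ((b + c) : d). The Python sketch below uses an ad hoc encoding that is an assumption of this illustration: a network is either an atomic resistance or a tuple <code>(op, left, right)</code>, where <code>"+"</code> marks a chain and <code>":"</code> a yoke. The function names <code>resistance</code> and <code>conjugate</code> are likewise hypothetical.</p>

```python
from fractions import Fraction

def resistance(net):
    # Compound resistance of a series-parallel network.
    if not isinstance(net, tuple):
        return Fraction(net)
    op, left, right = net
    r1, r2 = resistance(left), resistance(right)
    return r1 + r2 if op == "+" else 1 / (1 / r1 + 1 / r2)

def conjugate(net, invert_atoms=False):
    # Interchange chains and yokes; optionally invert each atomic resistance.
    if not isinstance(net, tuple):
        return 1 / Fraction(net) if invert_atoms else net
    op, left, right = net
    swapped = ":" if op == "+" else "+"
    return (swapped, conjugate(left, invert_atoms), conjugate(right, invert_atoms))

# a + ((b + c) : d) with all atomic resistances equal to one ohm
unit_net = ("+", 1, (":", ("+", 1, 1), 1))
assert resistance(conjugate(unit_net)) == 1 / resistance(unit_net)

# Arbitrary positive resistances: the atoms must be inverted as well
net = ("+", 2, (":", ("+", 3, 5), 7))
assert resistance(conjugate(net, invert_atoms=True)) == 1 / resistance(net)

print(resistance(unit_net), resistance(conjugate(unit_net)))  # 5/3 3/5
```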
<h3>Dual Equations on the Positive Reals</h3>
<p>Any equation on the positive reals concerning the two sums and multiplication can be dualized by applying the reciprocity map to obtain another equation.  The series sum and parallel sum are interchanged.  For example, the equation</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B1%7D%7B3%7D%285%2B%5Cfrac%7B2%7D%7B5%7D%2B%5Cfrac%7B3%7D%7B5%7D%29%3D2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{1}{3}(5+\frac{2}{5}+\frac{3}{5})=2' title='\frac{1}{3}(5+\frac{2}{5}+\frac{3}{5})=2' class='latex' /></p>
<p>dualizes to the equation</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=3%28%5Cfrac%7B1%7D%7B5%7D%3A%5Cfrac%7B5%7D%7B2%7D%3A%5Cfrac%7B5%7D%7B3%7D%29%3D%5Cfrac%7B1%7D%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='3(\frac{1}{5}:\frac{5}{2}:\frac{5}{3})=\frac{1}{2}' title='3(\frac{1}{5}:\frac{5}{2}:\frac{5}{3})=\frac{1}{2}' class='latex' /></p>
<p>The following equation</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=1%3D%281%2Bx%29%3A%281%2B%5Cfrac%7B1%7D%7Bx%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1=(1+x):(1+\frac{1}{x})' title='1=(1+x):(1+\frac{1}{x})' class='latex' /></p>
<p>holds for any positive real x.  Add any x to one and add its reciprocal to one.  The results are two numbers larger than one and their parallel sum is exactly one.  Dualizing yields the equation</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=1%3D%281%3A%5Cfrac%7B1%7D%7Bx%7D%29%2B%281%3Ax%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1=(1:\frac{1}{x})+(1:x)' title='1=(1:\frac{1}{x})+(1:x)' class='latex' /></p>
<p>for all positive reals x.  Taking the parallel sum of any x and its reciprocal with one yields two numbers smaller than one which sum to one.</p>
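<p>Both pairs of dual equations in this section can be confirmed with exact arithmetic. The following is a small Python sketch under the same assumption as before, namely an illustrative helper <code>par</code> for the parallel sum; the sample values of x are arbitrary.</p>

```python
from fractions import Fraction

def par(*xs):
    # Parallel sum: reciprocal of the (series) sum of the reciprocals.
    return 1 / sum(1 / Fraction(x) for x in xs)

# (1/3)(5 + 2/5 + 3/5) = 2 and its dual 3(1/5 : 5/2 : 5/3) = 1/2
lhs = Fraction(1, 3) * (5 + Fraction(2, 5) + Fraction(3, 5))
dual_lhs = 3 * par(Fraction(1, 5), Fraction(5, 2), Fraction(5, 3))
assert lhs == 2 and dual_lhs == 1 / lhs

# 1 = (1 + x) : (1 + 1/x) and its dual 1 = (1 : 1/x) + (1 : x)
for x in (Fraction(1, 7), Fraction(2), Fraction(9, 4)):
    assert par(1 + x, 1 + 1 / x) == 1
    assert par(1, 1 / x) + par(1, x) == 1
```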
<p>For any set of positive reals x<sub>1</sub>,&#8230;,x<sub>n</sub>, the parallel summation can be expressed using the capital P:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P_%7Bi%3D1%7D%5E%7Bn%7Dx_%7Bi%7D%3D%28%5Csum_%7Bi%3D1%7D%5E%7Bn%7Dx_%7Bi%7D%5E%7B-1%7D%29%5E%7B-1%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P_{i=1}^{n}x_{i}=(\sum_{i=1}^{n}x_{i}^{-1})^{-1}' title='P_{i=1}^{n}x_{i}=(\sum_{i=1}^{n}x_{i}^{-1})^{-1}' class='latex' />.</p>
<p style="text-align: center;">Parallel Summation</p>
<h3>Series and Parallel Geometric Series</h3>
<p>The following formula (and its dual) for partial sums of geometric series (starting at i = 1) is useful in financial mathematics (where x is any positive real).</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Bi%3D1%7D%5E%7Bn%7D%281%3Ax%29%5E%7Bi%7D%3D%281%3Ax%29%5Csum_%7Bi%3D0%7D%5E%7Bn-1%7D%281%3Ax%29%5E%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum_{i=1}^{n}(1:x)^{i}=(1:x)\sum_{i=0}^{n-1}(1:x)^{i}' title='\sum_{i=1}^{n}(1:x)^{i}=(1:x)\sum_{i=0}^{n-1}(1:x)^{i}' class='latex' /></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%3D%281%3Ax%29%5Cfrac%7B1-%281%3Ax%29%5E%7Bn%7D%7D%7B1-%281%3Ax%29%7D%3Dx%281-%281%3Ax%29%5E%7Bn%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='=(1:x)\frac{1-(1:x)^{n}}{1-(1:x)}=x(1-(1:x)^{n})' title='=(1:x)\frac{1-(1:x)^{n}}{1-(1:x)}=x(1-(1:x)^{n})' class='latex' /></p>
<p style="text-align: center;">Partial Sums of Geometric Series</p>
<p>Dualizing (and some algebra) yields a formula for partial sums of the parallel-sum geometric series. The dual of the series subtraction a – b where a &gt; b is the <em>parallel subtraction</em> <img src='http://s.wordpress.com/latex.php?latex=x%5Cominus%20y%3D%28%5Cfrac%7B1%7D%7Bx%7D-%5Cfrac%7B1%7D%7By%7D%29%5E%7B-1%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='x\ominus y=(\frac{1}{x}-\frac{1}{y})^{-1}' title='x\ominus y=(\frac{1}{x}-\frac{1}{y})^{-1}' class='latex' /> where x &lt; y.</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P_%7Bi%3D1%7D%5E%7Bn%7D%281%2Bx%29%5E%7Bi%7D%3D%281%2Bx%29P_%7Bi%3D0%7D%5E%7Bn-1%7D%281%2Bx%29%5E%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P_{i=1}^{n}(1+x)^{i}=(1+x)P_{i=0}^{n-1}(1+x)^{i}' title='P_{i=1}^{n}(1+x)^{i}=(1+x)P_{i=0}^{n-1}(1+x)^{i}' class='latex' /></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%3D%281%2Bx%29%5Cfrac%7B1%5Cominus%281%2Bx%29%5E%7Bn%7D%7D%7B1%5Cominus%281%2Bx%29%7D%3D%5Cfrac%7Bx%7D%7B1-%281%2Bx%29%5E%7B-n%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='=(1+x)\frac{1\ominus(1+x)^{n}}{1\ominus(1+x)}=\frac{x}{1-(1+x)^{-n}}' title='=(1+x)\frac{1\ominus(1+x)^{n}}{1\ominus(1+x)}=\frac{x}{1-(1+x)^{-n}}' class='latex' /></p>
<p style="text-align: center;">Partial Sums of Dual Geometric Series</p>
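<p>The two partial-sum formulas can be compared against direct summation. In the Python sketch below, the names <code>series_partial</code> and <code>parallel_partial</code> are illustrative, and the values x = 1/20 and n = 10 are arbitrary sample inputs rather than anything taken from the text.</p>

```python
from fractions import Fraction

def par(*xs):
    # Parallel sum: reciprocal of the (series) sum of the reciprocals.
    return 1 / sum(1 / Fraction(x) for x in xs)

def series_partial(x, n):
    # (1:x)^1 + (1:x)^2 + ... + (1:x)^n
    return sum(par(1, x) ** i for i in range(1, n + 1))

def parallel_partial(x, n):
    # (1+x)^1 : (1+x)^2 : ... : (1+x)^n
    return par(*[(1 + x) ** i for i in range(1, n + 1)])

x, n = Fraction(1, 20), 10
# Closed form of the series partial sum
assert series_partial(x, n) == x * (1 - par(1, x) ** n)
# Closed form of the dual (parallel) partial sum
assert parallel_partial(x, n) == x / (1 - (1 + x) ** (-n))
```

<p>Because the arithmetic is exact, the assertions test the identities themselves rather than a floating-point approximation of them.</p>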
<p>Dualization can also be applied to infinite series.  Taking the limit as <img src='http://s.wordpress.com/latex.php?latex=n%5Crightarrow%20%5Cinfty&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n\rightarrow \infty' title='n\rightarrow \infty' class='latex' /> in the above partial sum formulas yields, for any positive real x, the dual summation formulas for the series and parallel-sum geometric series that begin at the index i = 1.</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Bi%3D1%7D%5E%7B%5Cinfty%7D%281%3Ax%29%5E%7Bi%7D%3Dx%3DP_%7Bi%3D1%7D%5E%7B%5Cinfty%7D%281%2Bx%29%5E%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum_{i=1}^{\infty}(1:x)^{i}=x=P_{i=1}^{\infty}(1+x)^{i}' title='\sum_{i=1}^{\infty}(1:x)^{i}=x=P_{i=1}^{\infty}(1+x)^{i}' class='latex' /></p>
<p>The parallel sum series in the above equation can be used to represent a repeating decimal as a fraction.  An example will illustrate the procedure, so let z = .367367367… where the &#8220;367&#8221; repeats.  Then since 1/a + 1/b = 1/(a:b), we have:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=z%3D.367367%5Cldots%3D%5Csum_%7Bi%3D1%7D%5E%7B%5Cinfty%7D%5Cfrac%7B367%7D%7B%281000%29%5E%7Bi%7D%7D%20%3D%5Cfrac%7B367%7D%7BP_%7Bi%3D1%7D%5E%7B%5Cinfty%7D%281000%29%5E%7Bi%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='z=.367367\ldots=\sum_{i=1}^{\infty}\frac{367}{(1000)^{i}} =\frac{367}{P_{i=1}^{\infty}(1000)^{i}}' title='z=.367367\ldots=\sum_{i=1}^{\infty}\frac{367}{(1000)^{i}} =\frac{367}{P_{i=1}^{\infty}(1000)^{i}}' class='latex' />.</p>
<p>Taking  y = x+1 for x &gt; 0 in the previous geometric series equation yields</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P_%7Bi%3D1%7D%5E%7B%5Cinfty%7Dy%5E%7Bi%7D%3Dy-1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P_{i=1}^{\infty}y^{i}=y-1' title='P_{i=1}^{\infty}y^{i}=y-1' class='latex' /></p>
<p>for y &gt; 1 which is applied to yield</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=z%3D.367367%5Cldots%3D%5Cfrac%7B367%7D%7BP_%7Bi%3D1%7D%5E%7B%5Cinfty%7D%281000%29%5E%7Bi%7D%7D%3D%5Cfrac%7B367%7D%7B1000-1%7D%3D%5Cfrac%7B367%7D%7B999%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='z=.367367\ldots=\frac{367}{P_{i=1}^{\infty}(1000)^{i}}=\frac{367}{1000-1}=\frac{367}{999}' title='z=.367367\ldots=\frac{367}{P_{i=1}^{\infty}(1000)^{i}}=\frac{367}{1000-1}=\frac{367}{999}' class='latex' />.</p>
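<p>The same procedure works for any repeating block of digits. As a minimal sketch, assuming the decimal consists only of the repeating block (no non-repeating prefix), the hypothetical Python function <code>repeating_decimal</code> below applies the parallel-sum result with y equal to 10 raised to the block length.</p>

```python
from fractions import Fraction

def repeating_decimal(block):
    # 0.(block)(block)... = block / (y - 1) with y = 10^k, k = len(block),
    # since the parallel sum of y^i over i = 1, 2, ... equals y - 1.
    y = 10 ** len(block)
    return Fraction(int(block), y - 1)

z = repeating_decimal("367")
print(z)  # 367/999
```

<p>Note that <code>Fraction</code> reduces to lowest terms automatically, so for example the block "3" yields 1/3 rather than 3/9.</p>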
<p>For any positive real x, the beautiful dual formulas for the geometric series with indices beginning at i = 0 can be obtained by serial or parallel adding  <img src='http://s.wordpress.com/latex.php?latex=1%3D%20%281%3Ax%29%5E%7B0%7D%3D%20%281%2Bx%29%5E%7B0%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='1= (1:x)^{0}= (1+x)^{0}' title='1= (1:x)^{0}= (1+x)^{0}' class='latex' /> to each side.</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Bi%3D0%7D%5E%7B%5Cinfty%7D%281%3Ax%29%5E%7Bi%7D%3D%281%2Bx%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum_{i=0}^{\infty}(1:x)^{i}=(1+x)' title='\sum_{i=0}^{\infty}(1:x)^{i}=(1+x)' class='latex' /></p>
<p style="text-align: center;">Geometric Series for any Positive Real x</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P_%7Bi%3D0%7D%5E%7B%5Cinfty%7D%281%2Bx%29%5E%7Bi%7D%3D%281%3Ax%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P_{i=0}^{\infty}(1+x)^{i}=(1:x)' title='P_{i=0}^{\infty}(1+x)^{i}=(1:x)' class='latex' />.</p>
<p style="text-align: center;">Dual Geometric Series for any Positive Real x</p>
<h3>References</h3>
<p>MacMahon, Percy A. 1881. &#8220;Yoke-Chains and Multipartite Compositions in connexion with the Analytical Forms called &#8216;Trees&#8217; .&#8221;   <em>Proc. London Math. Soc.</em> 22: 330-46.</p>
<p>MacMahon, Percy A. 1892.  &#8220;The Combinations of Resistances.&#8221;  <em>The Electrician</em> 28, 601-2.</p>
<p>MacMahon, Percy A. 1978. <em>Collected Papers: Volume I, Combinatorics</em>. Edited by George E. Andrews.  Cambridge, Mass.: MIT Press.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathblog.ellerman.org/2010/02/series-parallel-duality-part-i-combating-series-chauvinism/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Series-Parallel Duality: Part II: Financial arithmetic</title>
		<link>http://www.mathblog.ellerman.org/2010/02/series-parallel-duality-part-ii-finanical-arithmetic/</link>
		<comments>http://www.mathblog.ellerman.org/2010/02/series-parallel-duality-part-ii-finanical-arithmetic/#comments</comments>
		<pubDate>Sat, 06 Feb 2010 01:41:22 +0000</pubDate>
		<dc:creator>David Ellerman</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Math economics]]></category>
		<category><![CDATA[amortization payments]]></category>
		<category><![CDATA[financial arithmetic]]></category>
		<category><![CDATA[series-parallel duality]]></category>
		<category><![CDATA[sinking funds]]></category>

		<guid isPermaLink="false">http://www.mathblog.ellerman.org/?p=23</guid>
		<description><![CDATA[In financial arithmetic and in the appraisal literature, it has been noticed that the basic formulas occur in pairs, one being the reciprocal of the other. This Part II of the series-parallel duality post shows that these reciprocal formulas are an example of the SP duality normally associated with electrical circuit theory.]]></description>
			<content:encoded><![CDATA[<h3>Reciprocal formulas in financial arithmetic</h3>
<p>In financial arithmetic and in the appraisal literature, it has been noticed that the basic formulas occur in pairs, one being the reciprocal of the other.  For instance, one popular text on real estate appraisal presents the &#8220;Basic Functions of Compound Interest and Their Reciprocals&#8221; [Friedman, Jack P. and Nicholas Ordway 1988. <em>Income Property Appraisal and Analysis</em>. Englewood Cliffs: Prentice Hall, p. 70].  The functions could be presented as follows to bring out the underlying symmetry.</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr >
<td width="312" valign="top" bgcolor="#ffcc00"><strong>Function</strong></td>
<td width="312" valign="top" bgcolor="#ffcc00"><strong>Reciprocal</strong></td>
</tr>
<tr >
<td width="312" valign="top">Principal Retired by Payment of One<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20%281%2Br%29%5E%7B-n%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  (1+r)^{-n}' title='  (1+r)^{-n}' class='latex' /></td>
<td width="312" valign="top">Payment to Retire Principal of One<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20%281%2Br%29%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  (1+r)^{n}' title='  (1+r)^{n}' class='latex' /></td>
</tr>
<tr>
<td width="312" valign="top">Principal Amortized by Payments of One<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20a%28n%2Cr%29%3D%5Cfrac%7B1%7D%7B%281%2Br%29%5E%7B1%7D%7D%2B%5Cldots%2B%5Cfrac%7B1%7D%7B%281%2Br%29%5E%7Bn%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  a(n,r)=\frac{1}{(1+r)^{1}}+\ldots+\frac{1}{(1+r)^{n}}' title='  a(n,r)=\frac{1}{(1+r)^{1}}+\ldots+\frac{1}{(1+r)^{n}}' class='latex' /></td>
<td width="312" valign="top">Payments to Amortize a Principal of One<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20%5Cfrac%7B1%7D%7Ba%28n%2Cr%29%7D%3D%281%2Br%29%5E%7B1%7D%3A%5Cldots%3A%281%2Br%29%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  \frac{1}{a(n,r)}=(1+r)^{1}:\ldots:(1+r)^{n}' title='  \frac{1}{a(n,r)}=(1+r)^{1}:\ldots:(1+r)^{n}' class='latex' /></td>
</tr>
<tr>
<td width="312" valign="top">Fund Accumulated by One per Period<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20s%28n%2Cr%29%3D%281%2Br%29%5E%7Bn-1%7D%2B%5Cldots%2B%281%2Br%29%2B1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  s(n,r)=(1+r)^{n-1}+\ldots+(1+r)+1' title='  s(n,r)=(1+r)^{n-1}+\ldots+(1+r)+1' class='latex' /></td>
<td width="312" valign="top">Payments to Accumulate a Fund of One<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20%5Cfrac%7B1%7D%7Bs%28n%2Cr%29%7D%3D%5Cfrac%7B1%7D%7B%281%2Br%29%5E%7Bn-1%7D%7D%3A%5Cldots%3A%5Cfrac%7B1%7D%7B%281%2Br%29%7D%3A1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  \frac{1}{s(n,r)}=\frac{1}{(1+r)^{n-1}}:\ldots:\frac{1}{(1+r)}:1' title='  \frac{1}{s(n,r)}=\frac{1}{(1+r)^{n-1}}:\ldots:\frac{1}{(1+r)}:1' class='latex' /></td>
</tr>
</tbody>
</table>
<p style="text-align: center;">The Six Functions of One</p>
<p>This Part II of the series-parallel duality post shows that these reciprocal formulas are an example of the SP duality normally associated with electrical circuit theory.</p>
<h3><span id="more-23"></span>Parallel Sums in Financial Arithmetic</h3>
<p>The parallel sum has a natural interpretation in finance so that each equation and formula in financial arithmetic can be paired with a dual equation or formula.  The parallel sum &#8220;smooths&#8221; balloon payments to yield the constant amortization payment to pay off a loan.  If r is the interest rate per period, then PV(1+r)<sup>n</sup> is the one-shot balloon payment at time n that would pay off a loan with the principal value of PV.  The similar balloon payments that could be paid at times t =1, 2,&#8230;, n, any one of which would pay off the loan, are</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=PV%281%2Br%29%5E%7B1%7D%2CPV%281%2Br%29%5E%7B2%7D%2C%5Cldots%2CPV%281%2Br%29%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='PV(1+r)^{1},PV(1+r)^{2},\ldots,PV(1+r)^{n}' title='PV(1+r)^{1},PV(1+r)^{2},\ldots,PV(1+r)^{n}' class='latex' />.</p>
<p>But what is the equal amortization payment PMT that would pay off the same loan when paid at each of the times t =1, 2, &#8230;, n?  It is simply the parallel sum of the one-shot balloon payments:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=PMT%3DPV%281%2Br%29%5E%7B1%7D%3APV%281%2Br%29%5E%7B2%7D%3A%5Cldots%3APV%281%2Br%29%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='PMT=PV(1+r)^{1}:PV(1+r)^{2}:\ldots:PV(1+r)^{n}' title='PMT=PV(1+r)^{1}:PV(1+r)^{2}:\ldots:PV(1+r)^{n}' class='latex' /></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%3D%20P_%7Bi%3D1%7D%5E%7Bn%7DPV%281%2Br%29%5E%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='= P_{i=1}^{n}PV(1+r)^{i}' title='= P_{i=1}^{n}PV(1+r)^{i}' class='latex' />.</p>
<p style="text-align: center;">Amortization Payment is Parallel Sum of Balloon Payments</p>
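<p>The claim above can be checked with a short Python sketch. The loan figures are hypothetical, and the closed-form expression PV&middot;r/(1 &minus; (1+r)<sup>&minus;n</sup>) used for comparison is the standard textbook annuity formula, which this post does not state explicitly.</p>

```python
def parallel_sum(terms):
    """Parallel sum of positive numbers: reciprocal of the sum of reciprocals."""
    return 1.0 / sum(1.0 / t for t in terms)

def amortization_payment(pv, r, n):
    """Equal payment as the parallel sum of the n one-shot balloon payments."""
    balloons = [pv * (1 + r) ** i for i in range(1, n + 1)]
    return parallel_sum(balloons)

def closed_form_payment(pv, r, n):
    """Standard annuity formula, used here only as a cross-check."""
    return pv * r / (1 - (1 + r) ** -n)

pv, r, n = 100_000.0, 0.05, 10   # hypothetical loan: principal, rate, periods
pmt = amortization_payment(pv, r, n)

# Run the loan forward: add interest each period, then subtract the payment.
balance = pv
for _ in range(n):
    balance = balance * (1 + r) - pmt

print(pmt, closed_form_payment(pv, r, n), balance)
```

<p>The two payment figures should coincide, and the final balance should be zero up to floating-point rounding.</p>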
<p>This use of the parallel sum is not restricted to financial arithmetic.  For example, suppose a forest of initial size PV (in harvestable board feet) grows at the rate r<sub>i</sub> in the i-th period.  Then</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P_%7Bm%7D%3DPV%5CPi_%7Bi%3D1%7D%5E%7Bm%7D%281%2Br_%7Bi%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P_{m}=PV\Pi_{i=1}^{m}(1+r_{i})' title='P_{m}=PV\Pi_{i=1}^{m}(1+r_{i})' class='latex' /></p>
<p>would be the <em>one-shot harvest</em> that could be obtained at the end of the m-th period.  For instance,  P<sub>3</sub>, P<sub>17</sub>, and P<sub>23</sub> are the amounts that could be harvested if the whole forest were harvested at the end of the 3<sup>rd</sup>, 17<sup>th</sup>, or the 23<sup>rd</sup> period.  But what is the smooth or equal harvest PMT such that, if PMT were harvested at the end of each of the 3<sup>rd</sup>, 17<sup>th</sup>, and 23<sup>rd</sup> periods, the forest would be exactly exhausted at the end of that last period?  That smooth harvest amount is just the parallel sum of the one-time harvests:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=PMT%3DP_%7B3%7D%3AP_%7B17%7D%3AP_%7B23%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='PMT=P_{3}:P_{17}:P_{23}' title='PMT=P_{3}:P_{17}:P_{23}' class='latex' />.</p>
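<p>A small simulation makes the forest example concrete. The initial size and the growth rates below are invented for illustration; only the harvest periods 3, 17, and 23 come from the text. Exact fractions are used so that the final check is an exact equality.</p>

```python
from fractions import Fraction

def parallel_sum(terms):
    """Parallel sum of positive numbers: reciprocal of the sum of reciprocals."""
    return 1 / sum(1 / Fraction(t) for t in terms)

pv = Fraction(1000)   # hypothetical initial forest size in board feet
# Hypothetical growth rates r_1..r_23, alternating between 2% and 3%.
rates = [Fraction(2, 100) if i % 2 else Fraction(3, 100) for i in range(1, 24)]

def one_shot_harvest(m):
    """P_m: size of the untouched forest at the end of period m."""
    size = pv
    for r in rates[:m]:
        size *= 1 + r
    return size

harvest_periods = [3, 17, 23]
pmt = parallel_sum(one_shot_harvest(m) for m in harvest_periods)

# Grow the forest period by period and cut PMT at each harvest period.
forest = pv
for period, r in enumerate(rates, start=1):
    forest *= 1 + r
    if period in harvest_periods:
        forest -= pmt

print(float(pmt), forest)
```

<p>After the third harvest the remaining forest is exactly zero, which is the defining property of the smooth harvest.</p>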
<p>Returning to financial arithmetic, the discounted present value at time zero of n one dollar payments at the end of periods 1, 2,…, n is a(n,r), the <em>present value of an annuity of one.</em></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=a%28n%2Cr%29%3D%5Cfrac%7B1%7D%7B%281%2Br%29%5E%7B1%7D%7D%2B%5Cfrac%7B1%7D%7B%281%2Br%29%5E%7B2%7D%7D%2B%5Cldots%2B%5Cfrac%7B1%7D%7B%281%2Br%29%5E%7Bn%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='a(n,r)=\frac{1}{(1+r)^{1}}+\frac{1}{(1+r)^{2}}+\ldots+\frac{1}{(1+r)^{n}}' title='a(n,r)=\frac{1}{(1+r)^{1}}+\frac{1}{(1+r)^{2}}+\ldots+\frac{1}{(1+r)^{n}}' class='latex' /></p>
<p style="text-align: center;">Present Value of Payments of One</p>
<p>Dualizing [i.e., applying the reciprocity map from Part I that sends x to 1/x, which turns series sums into parallel sums, and using the fact that it sends 1/(1+r) to 1+r] yields:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B1%7D%7Ba%28n%2Cr%29%7D%3D%281%2Br%29%5E%7B1%7D%3A%281%2Br%29%5E%7B2%7D%3A%5Cldots%3A%281%2Br%29%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{1}{a(n,r)}=(1+r)^{1}:(1+r)^{2}:\ldots:(1+r)^{n}' title='\frac{1}{a(n,r)}=(1+r)^{1}:(1+r)^{2}:\ldots:(1+r)^{n}' class='latex' /></p>
<p style="text-align: center;">Payments to Amortize a Principal of One</p>
<p>For the principal value of one dollar at time zero, the one-shot payments at times 1, 2,…, n that would each pay off the principal are the compounded principals (1+r)<sup>1</sup>, (1+r)<sup>2</sup>,…, (1+r)<sup>n</sup>.  The parallel sum (1+r)<sup>1</sup>: (1+r)<sup>2</sup> paid at times 1 and 2 would pay off the $1 principal.  Similarly, the parallel sum of the first three one-shot payments paid at times 1, 2, and 3 would pay off the $1 principal, and so forth.</p>
<p>Suppose the constant interest rate is 20 percent per period.  Then the discounted present value of two amortization payments of 1 at the end of the first and second periods is the principal value of the loan paid off by those payments, i.e., 55/36:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=a%282%2C.20%29%3D%5Cfrac%7B55%7D%7B36%7D%3D%5Cfrac%7B1%7D%7B%281.20%29%5E%7B1%7D%7D%2B%5Cfrac%7B1%7D%7B%281.20%29%5E%7B2%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='a(2,.20)=\frac{55}{36}=\frac{1}{(1.20)^{1}}+\frac{1}{(1.20)^{2}}' title='a(2,.20)=\frac{55}{36}=\frac{1}{(1.20)^{1}}+\frac{1}{(1.20)^{2}}' class='latex' />.</p>
<p>The equation dualizes to:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B1%7D%7Ba%282%2C.20%29%7D%3D%5Cfrac%7B36%7D%7B55%7D%3D%281.20%29%5E%7B1%7D%3A%281.20%29%5E%7B2%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{1}{a(2,.20)}=\frac{36}{55}=(1.20)^{1}:(1.20)^{2}' title='\frac{1}{a(2,.20)}=\frac{36}{55}=(1.20)^{1}:(1.20)^{2}' class='latex' />.</p>
<p>The amounts (1.2)<sup>1</sup> = 6/5 and (1.2)<sup>2</sup> = 36/25 are the compounded principal values of a $1 loan so they are the one-shot or balloon payments that would pay off a loan of principal value $1 if paid, respectively, at the end of the first or the second period.  Their parallel sum, 36/55, is the equal amortization payment that would pay off that loan of $1 if paid at the end of both the first and second periods.</p>
<p>These facts can be arranged in the following dual format.</p>
<table border="1" cellspacing="0" cellpadding="0" align="center">
<tbody>
<tr >
<td width="312" valign="top"><strong>Primal Fact:</strong><br />
The series sum of the discounted amortization payments for a loan is the principal of the loan.</td>
<td width="312" valign="top"><strong>Dual Fact:</strong><br />
The parallel sum of the  compounded principals of a loan is the amortization payment for the loan.</td>
</tr>
</tbody>
</table>
<p>The example illustrates some of the substitutions involved in dualizing the interpretation.</p>
<table border="1" cellspacing="0" cellpadding="0" >
<tbody>
<tr >
<td width="109" valign="top">series sum</td>
<td width="42" valign="top"><img src='http://s.wordpress.com/latex.php?latex=%5Cleftrightarrow&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\leftrightarrow' title='\leftrightarrow' class='latex' /></td>
<td width="147" valign="top">parallel   sum</td>
</tr>
<tr >
<td width="109" valign="top">discounting</td>
<td width="42" valign="top"><img src='http://s.wordpress.com/latex.php?latex=%5Cleftrightarrow&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\leftrightarrow' title='\leftrightarrow' class='latex' /></td>
<td width="147" valign="top">compounding</td>
</tr>
<tr>
<td width="109" valign="top">principals</td>
<td width="42" valign="top"><img src='http://s.wordpress.com/latex.php?latex=%5Cleftrightarrow&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\leftrightarrow' title='\leftrightarrow' class='latex' /></td>
<td width="147" valign="top">payments</td>
</tr>
</tbody>
</table>
<h3>Future Values and Sinking Fund Deposits</h3>
<p>Another staple of financial arithmetic is the computation of sinking fund deposits.  The compounded future value at time n of n one dollar deposits at times 1,2,…, n is s(n,r), the <em>accumulation of one per period.</em></p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=s%28n%2Cr%29%3D%281%2Br%29%5E%7Bn-1%7D%2B%281%2Br%29%5E%7Bn-2%7D%2B%5Cldots%2B%281%2Br%29%2B1%3Da%28n%2Cr%29%281%2Br%29%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='s(n,r)=(1+r)^{n-1}+(1+r)^{n-2}+\ldots+(1+r)+1=a(n,r)(1+r)^{n}' title='s(n,r)=(1+r)^{n-1}+(1+r)^{n-2}+\ldots+(1+r)+1=a(n,r)(1+r)^{n}' class='latex' /></p>
<p style="text-align: center;">Fund Accumulated by One per Period</p>
<p>The discounted values 1/(1+r)<sup>n-1</sup>,…, 1/(1+r), 1 of a one-dollar fund are the one-shot deposits at times 1,…, n‑1, n that would each by itself yield a one-dollar future value for the sinking fund at time n.  The parallel sum of these one-shot deposits is the (equal) sinking fund deposit at times 1,…, n-1, n that would yield a one-dollar fund at time n:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Cfrac%7B1%7D%7Bs%28n%2Cr%29%7D%3D%5Cfrac%7B1%7D%7B%281%2Br%29%5E%7Bn-1%7D%7D%3A%5Cfrac%7B1%7D%7B%281%2Br%29%5E%7Bn-2%7D%7D%3A%5Cldots%3A%5Cfrac%7B1%7D%7B%281%2Br%29%7D%3A1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\frac{1}{s(n,r)}=\frac{1}{(1+r)^{n-1}}:\frac{1}{(1+r)^{n-2}}:\ldots:\frac{1}{(1+r)}:1' title='\frac{1}{s(n,r)}=\frac{1}{(1+r)^{n-1}}:\frac{1}{(1+r)^{n-2}}:\ldots:\frac{1}{(1+r)}:1' class='latex' /></p>
<p style="text-align: center;">Sinking Fund Factor: Payments to Accumulate a Fund of One</p>
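<p>The sinking fund factor can be verified in the same way. In this sketch the rate of 10 percent and the horizon of 5 periods are arbitrary illustrative choices, and <code>parallel_sum</code> is a helper name introduced here.</p>

```python
from fractions import Fraction

def parallel_sum(terms):
    """Parallel sum of positive numbers: reciprocal of the sum of reciprocals."""
    return 1 / sum(1 / Fraction(t) for t in terms)

r, n = Fraction(1, 10), 5   # illustrative rate and number of periods

# One-shot deposits at times 1..n that would each alone grow to a fund of one.
one_shot_deposits = [1 / (1 + r) ** (n - t) for t in range(1, n + 1)]
deposit = parallel_sum(one_shot_deposits)

# s(n,r): fund accumulated by deposits of one at times 1..n.
s = sum((1 + r) ** k for k in range(n))

# Accumulate the equal deposit: earn interest on the fund, then deposit.
fund = Fraction(0)
for t in range(1, n + 1):
    fund = fund * (1 + r) + deposit

print(deposit, 1 / s, fund)
```

<p>The equal deposit matches 1/s(n,r), and the accumulated fund at time n is exactly one.</p>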
<p>The dual interpretations might be stated as follows.</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr >
<td width="312" valign="top">The series sum of the n compounded one-dollar deposits is the sinking fund that is accumulated by the one-dollar deposits.</td>
<td width="312" valign="top">The parallel sum of the n discounted one-dollar funds is the deposit that accumulates to a one-dollar sinking fund.</td>
</tr>
</tbody>
</table>
<p>We now have reproduced the six basic functions of the valuation literature as three pairs of series-parallel duals.</p>
<table border="1" cellspacing="0" cellpadding="0" >
<tbody>
<tr>
<td width="312" valign="top" bgcolor="#ffcc00"><strong>Function</strong></td>
<td width="312" valign="top" bgcolor="#ffcc00"><strong>Reciprocal</strong></td>
</tr>
<tr>
<td width="312" valign="top">Principal Retired by Payment of One<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20%281%2Br%29%5E%7B-n%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  (1+r)^{-n}' title='  (1+r)^{-n}' class='latex' /></td>
<td width="312" valign="top">Payment to Retire Principal of One<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20%281%2Br%29%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  (1+r)^{n}' title='  (1+r)^{n}' class='latex' /></td>
</tr>
<tr>
<td width="312" valign="top">Principal Amortized by Payments of One<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20a%28n%2Cr%29%3D%5Cfrac%7B1%7D%7B%281%2Br%29%5E%7B1%7D%7D%2B%5Cldots%2B%5Cfrac%7B1%7D%7B%281%2Br%29%5E%7Bn%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  a(n,r)=\frac{1}{(1+r)^{1}}+\ldots+\frac{1}{(1+r)^{n}}' title='  a(n,r)=\frac{1}{(1+r)^{1}}+\ldots+\frac{1}{(1+r)^{n}}' class='latex' /></td>
<td width="312" valign="top">Payments to Amortize a Principal of One<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20%5Cfrac%7B1%7D%7Ba%28n%2Cr%29%7D%3D%281%2Br%29%5E%7B1%7D%3A%5Cldots%3A%281%2Br%29%5E%7Bn%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  \frac{1}{a(n,r)}=(1+r)^{1}:\ldots:(1+r)^{n}' title='  \frac{1}{a(n,r)}=(1+r)^{1}:\ldots:(1+r)^{n}' class='latex' /></td>
</tr>
<tr>
<td width="312" valign="top">Fund Accumulated by One per Period<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20s%28n%2Cr%29%3D%281%2Br%29%5E%7Bn-1%7D%2B%5Cldots%2B%281%2Br%29%2B1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  s(n,r)=(1+r)^{n-1}+\ldots+(1+r)+1' title='  s(n,r)=(1+r)^{n-1}+\ldots+(1+r)+1' class='latex' /></td>
<td width="312" valign="top">Payments to Accumulate a Fund of One<br />
<img src='http://s.wordpress.com/latex.php?latex=%20%20%5Cfrac%7B1%7D%7Bs%28n%2Cr%29%7D%3D%5Cfrac%7B1%7D%7B%281%2Br%29%5E%7Bn-1%7D%7D%3A%5Cldots%3A%5Cfrac%7B1%7D%7B%281%2Br%29%7D%3A1&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='  \frac{1}{s(n,r)}=\frac{1}{(1+r)^{n-1}}:\ldots:\frac{1}{(1+r)}:1' title='  \frac{1}{s(n,r)}=\frac{1}{(1+r)^{n-1}}:\ldots:\frac{1}{(1+r)}:1' class='latex' /></td>
</tr>
</tbody>
</table>
<p style="text-align: center;">The Six Functions of One</p>
<h3>Infinite Streams of Payments</h3>
<p>The formulas for amortization payments can be extended to an infinite time horizon.  This involves a financial interpretation for the dual geometric series from Part I with indices beginning at i = 1:</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Bi%3D1%7D%5E%7B%5Cinfty%7D%281%3Ax%29%5E%7Bi%7D%3Dx%3DP_%7Bi%3D1%7D%5E%7B%5Cinfty%7D%281%2Bx%29%5E%7Bi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum_{i=1}^{\infty}(1:x)^{i}=x=P_{i=1}^{\infty}(1+x)^{i}' title='\sum_{i=1}^{\infty}(1:x)^{i}=x=P_{i=1}^{\infty}(1+x)^{i}' class='latex' />.</p>
<p>Taking x = 1/r so that 1:x = 1:1/r = 1/(1+r) in the series summation yields the fact that the discounted present value of the constant stream of one-dollar payments at times 1, 2,… is the reciprocal of the interest rate, x = 1/r.</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=%5Csum_%7Bi%3D1%7D%5E%7B%5Cinfty%7D%281%2Br%29%5E%7B-i%7D%3D%5Cfrac%7B1%7D%7Br%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sum_{i=1}^{\infty}(1+r)^{-i}=\frac{1}{r}' title='\sum_{i=1}^{\infty}(1+r)^{-i}=\frac{1}{r}' class='latex' />.</p>
<p style="text-align: center;">Perpetuity Capitalization Formula</p>
<p>Taking x = r in the parallel summation yields the fact that the parallel sum of compounded values of one dollar is the interest rate r, the constant payment at t = 1, 2,… that pays off a principal value of one dollar.</p>
<p style="text-align: center;"><img src='http://s.wordpress.com/latex.php?latex=P_%7Bi%3D1%7D%5E%7B%5Cinfty%7D%281%2Br%29%5E%7Bi%7D%3Dr&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P_{i=1}^{\infty}(1+r)^{i}=r' title='P_{i=1}^{\infty}(1+r)^{i}=r' class='latex' /></p>
<p style="text-align: center;">Dual of Perpetuity Capitalization Formula</p>
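<p>One way to see the infinite-horizon formulas emerge is to watch the finite amortization payment 1/a(n,r) fall toward r as the horizon n grows. The following Python sketch does this for an illustrative rate of 5 percent; the horizons 10, 100, and 1000 are arbitrary choices.</p>

```python
def parallel_sum(terms):
    """Parallel sum of positive numbers: reciprocal of the sum of reciprocals."""
    return 1.0 / sum(1.0 / t for t in terms)

r = 0.05   # illustrative interest rate per period

def amortization_payment(n):
    """Equal payment at times 1..n that retires a principal of one."""
    return parallel_sum((1 + r) ** i for i in range(1, n + 1))

payments = {n: amortization_payment(n) for n in (10, 100, 1000)}
for n, pmt in payments.items():
    # 1/pmt is a(n,r), the principal amortized by payments of one.
    print(n, round(pmt, 6), round(1 / pmt, 4))
```

<p>As n increases, the payment column approaches r = 0.05 and the principal column approaches 1/r = 20.</p>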
<p>Thus the dual to the perpetuity capitalization formula (1/r is the principal whose payments are 1) is the fact that the constant income stream of r is the equivalent of the capital of $1 (r is the payment whose principal is 1).</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="312" valign="top">The series sum of the stream of discounted $1 amortization payments (which is the principal amortized by a $1 amortization payment) is the reciprocal of the interest rate,<br />
<img src='http://s.wordpress.com/latex.php?latex=%281%2Br%29%5E%7B-1%7D%2B%281%2Br%29%5E%7B-2%7D%2B%5Cldots%3Dr%5E%7B-1%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(1+r)^{-1}+(1+r)^{-2}+\ldots=r^{-1}' title='(1+r)^{-1}+(1+r)^{-2}+\ldots=r^{-1}' class='latex' />.</td>
<td width="312" valign="top">The parallel sum of the stream of compounded $1 principals (which is the payment that amortizes a $1 principal) is the interest rate,<br />
<img src='http://s.wordpress.com/latex.php?latex=%281%2Br%29%5E%7B1%7D%3A%281%2Br%29%5E%7B%202%7D%3A%5Cldots%3Dr&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(1+r)^{1}:(1+r)^{ 2}:\ldots=r' title='(1+r)^{1}:(1+r)^{ 2}:\ldots=r' class='latex' />.</td>
</tr>
</tbody>
</table>
<p>More material on series-parallel duality can be found in Chapter 12 of my 1995 book, <em>Intellectual Trespassing as a Way of Life: Essays in Philosophy, Economics, and Mathematics</em> (Rowman &amp; Littlefield), or in a paper that can be downloaded <a href="http://www.ellerman.org/Davids-Stuff/Maths/sp_math.doc">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathblog.ellerman.org/2010/02/series-parallel-duality-part-ii-finanical-arithmetic/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Math of Double-Entry Bookkeeping: Part I (scalars)</title>
		<link>http://www.mathblog.ellerman.org/2010/02/the-math-of-double-entry-bookkeeping-part-i-scalars/</link>
		<comments>http://www.mathblog.ellerman.org/2010/02/the-math-of-double-entry-bookkeeping-part-i-scalars/#comments</comments>
		<pubDate>Fri, 05 Feb 2010 00:35:01 +0000</pubDate>
		<dc:creator>David Ellerman</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Math economics]]></category>
		<category><![CDATA[double-entry bookkeeping]]></category>
		<category><![CDATA[Kotruljevic]]></category>
		<category><![CDATA[Pacioli]]></category>
		<category><![CDATA[single-entry accounting]]></category>

		<guid isPermaLink="false">http://www.mathblog.ellerman.org/?p=15</guid>
		<description><![CDATA[Double-entry bookkeeping illustrates one of the most astonishing examples of intellectual insulation between disciplines—the very opposite of intellectual trespassing.]]></description>
			<content:encoded><![CDATA[<h3>Mathematics and Accounting: Two disjoint universes?</h3>
<p>Double-entry bookkeeping illustrates one of the most astonishing examples of intellectual insulation between disciplines, the very opposite of intellectual trespassing.  Double-entry bookkeeping (DEB) was developed during the fifteenth century and was published as a system in 1494 by the Italian mathematician <a href="http://en.wikipedia.org/wiki/Pacioli">Luca Pacioli</a>.  It was anticipated in a 1458 manuscript of the Croatian merchant <a href="http://en.wikipedia.org/wiki/Benedikt_Kotruljevi%C4%87">Benedikt Kotruljević</a> (published only in 1573). Double-entry bookkeeping has been used for over five centuries in commercial accounting systems.  If the mathematical formulation of any field should be well understood, one would think it might be accounting.  Remarkably, however, the mathematical formulation of <a href="http://en.wikipedia.org/wiki/Double-entry_bookkeeping_system">double entry accounting</a> (algebraic operations on ordered pairs of numbers) was first published only in 1982 and is still largely unknown both in mathematics and accounting.</p>
<p><span id="more-15"></span></p>
<p>The mathematical basis for a precise treatment of DEB was developed in the nineteenth century by <a href="http://en.wikipedia.org/wiki/William_Rowan_Hamilton">William Rowan Hamilton</a> (1837) as an abstract mathematical construction using ordered pairs or <a href="http://www.maths.tcd.ie/pub/HistMath/People/Hamilton/PureTime/">couples</a> to deal with the complex numbers.  The multiplicative version of this construction is the &#8220;group of fractions&#8221; which uses ordered pairs of whole numbers (written vertically) to enlarge the system of positive whole numbers to the system of positive fractions containing multiplicative inverses (just reverse the entries in a fraction to get its inverse).  The ordered pairs construction that is relevant to conventional DEB is the additive case called the &#8220;group of differences.&#8221;  It is used to construct a number system with &#8220;additive inverses&#8221; by using operations on ordered pairs of positive numbers including zero (unsigned numbers), and it is part of undergraduate (if not freshman) abstract algebra.</p>
<p>All that is required to grasp the connection between the group of differences and DEB is to make the identification:</p>
<p style="text-align: center;">ordered pairs (horizontally written) of numbers in group of differences construction</p>
<p style="text-align: center;">= two-sided T-accounts of DEB (debits on the left side and credits on the right side).</p>
<p>In view of this identification, the ordinary group of differences (or fractions in the multiplicative case) might be called the <em>Pacioli group</em>—which would be distinguished from the modern generalization from the 1950s called the <a href="http://en.wikipedia.org/wiki/Grothendieck_group">Grothendieck group</a>.</p>
<p>In spite of some attention to DEB by mathematicians such as De Morgan (1869), Cayley (1894), and Kemeny, Schleifer, Snell, and Thompson (1962), this connection has not been noted in mathematics (not to mention in accounting) with one perhaps solitary exception. In a semi-popular book, <a href="http://en.wikipedia.org/wiki/Dudley_E._Littlewood">D. E. Littlewood</a> (NB: not the <a href="http://en.wikipedia.org/wiki/John_Edensor_Littlewood">J. E. Littlewood</a> of Hardy-and-Littlewood fame who was D. E. Littlewood&#8217;s tutor at Trinity) noted the connection:</p>
<blockquote><p>The bank associates two totals with each customer&#8217;s account, the total of moneys credited and the total of moneys withdrawn.  The net balance is then regarded as the same if, for example, the credit amounts to £102 and the debit £100, as if the credit were £52 and the debit £50.  If the debit exceeds the credit the balance is negative.</p>
<p>This model is adopted in the definition of signed integers.  Consider pairs of cardinal numbers (a, b) in which the first number corresponds to the debit, and the second to the credit.  [<em>Skeleton Key of Mathematics</em>, 1960, p. 18]</p></blockquote>
<p>With this exception (which was not further developed), I have not been able to find a single mathematics—not to mention accounting—book or paper, elementary or advanced, popular or esoteric, which notes that the ordered pairs of the group of differences construction are the T-accounts used in the business world for about five centuries.  Mathematics and accounting truly seem to live in disjoint universes with no trespassing between them.</p>
<p>Is the Pacioli group the correct formulation of DEB? One acid test of a mathematical formulation of a theory is the question of whether or not it facilitates the generalization of the theory.  Normal bookkeeping does not deal with incommensurate physical quantities; everything is expressed in the common units of money.  The accounting textbooks still treat &#8220;common units&#8221; as a necessary condition for accounting. But the mathematical formulation of DEB effortlessly generalizes to multi-dimensional accounting using vectors of incommensurate quantities (see Part II of this post).</p>
<h3>The Pacioli Group</h3>
<p>Multi-dimensional accounting is based on the group of differences or Pacioli group construction starting with non-negative vectors.  The usual case of  accounting can be identified with the special case using one dimensional vectors or scalars.  In this Part I, the focus is on the scalar case. The ordered pairs of non-negative scalars are the usual <em>T-accounts.</em> The left-hand side (LHS) vector d is the debit entry and the right-hand side (RHS) vector c is the credit entry.</p>
<p style="text-align: center;">T-account: [ d // c ] = [ debit vector // credit vector ].</p>
<p>The algebraic operations on T-accounts are much like the operations on fractions except that addition is substituted for multiplication.  In order to illustrate the additive-multiplicative analogy between T-accounts and fractions, the basic definitions will be developed in parallel columns. For a fraction or &#8220;multiplicative T-account&#8221; using positive integers, we may take the numerator as the debit entry and the denominator as the credit entry.</p>
<table border="1" cellspacing="0" cellpadding="0" align="center">
<thead>
<tr align="center">
<td width="103" valign="top"><strong><br />
</strong></td>
<td width="264" valign="top"><strong>Additive Case</strong></td>
<td width="271" valign="top"><strong>Multiplicative Case</strong></td>
</tr>
</thead>
<tbody>
<tr align="center">
<td width="103" valign="top">Operation on T-accounts</td>
<td width="264" valign="top">T-accounts add together by adding debits to debits and credits to credits:<br />
[ w // x ] + [ y // z ] = [ w + y // x + z ].</td>
<td width="271" valign="top">Fractions multiply together by multiplying numerator times numerator and denominator times denominator:<br />
(w/x)(y/z) = (wy/xz).</td>
</tr>
<tr align="center">
<td width="103" valign="top">Identity element for   operation</td>
<td width="264" valign="top">The identity element for addition is the zero   T-account [ 0 // 0 ].</td>
<td width="271" valign="top">The identity element for multiplication is the unit fraction   (1/1).</td>
</tr>
<tr align="center">
<td width="103" valign="top">Equality between two   T-accounts.</td>
<td width="264" valign="top">Given two T-accounts [ w // x ] and [ y // z ], the <em>cross-sums</em> are the two vectors obtained by adding the credit entry in one T-account to the debit entry in the other T-account.  The equivalence relation between T-accounts is defined by setting two T-accounts <em>equal</em> if their cross-sums are equal:<br />
[ w // x ] = [ y // z ] if w + z = x + y.</td>
<td width="271" valign="top">Given two fractions (w/x) and (y/z), the <em>cross-multiples</em> are the two integers obtained by multiplying the numerator of one with the denominator of the other.  The equivalence relation between fractions is defined by setting two fractions <em>equal</em> if their cross-multiples are equal:<br />
(w/x) = (y/z) if wz = xy.</td>
</tr>
<tr align="center">
<td width="103" valign="top">Inverses</td>
<td width="264" valign="top">The negative or additive inverse of a T-account is obtained by reversing the debit and credit entries:<br />
– [ w // x ] = [ x // w ].</td>
<td width="271" valign="top">The multiplicative inverse of a fraction is obtained by reversing the numerator and denominator:<br />
(w/x)<sup>–1</sup> = (x/w).</td>
</tr>
<tr align="center">
<td width="103" valign="top">&#8220;Disjointness&#8221;   of T-accounts</td>
<td width="264" valign="top">Two non-negative   scalars x and y are said to be <em>disjoint</em> if min(x,y) = 0.</td>
<td width="271" valign="top">Two integers w and x   are said to be <em>relatively prime</em> if   gcd(w,x) = 1.</td>
</tr>
<tr align="center">
<td width="103" valign="top">&#8220;Reduced form&#8221;   for a T-account</td>
<td width="264" valign="top">A T-account [x // y] is in <em>reduced form</em> if x and y are disjoint.</td>
<td width="271" valign="top">A fraction (w/x) is in <em>lowest terms</em> if w and x are relatively prime.</td>
</tr>
<tr align="center">
<td width="103" valign="top">Unique reduced form   representation</td>
<td width="264" valign="top">Every T-account [x // y] has a unique reduced representation<br />
[x–min(x,y) // y–min(x,y)].</td>
<td width="271" valign="top">Each fraction (w/x) has a unique representation in lowest terms<br />
(w/gcd(w,x)) / (x/gcd(w,x)).</td>
</tr>
<tr align="center">
<td width="103" valign="top">Example</td>
<td width="264" valign="top">Consider the T-account [12 // 10]. The minimum is min(12,10) = 10 so the reduced form   is: [2 // 0].</td>
<td width="271" valign="top">Consider the fraction 28/35. The greatest common divisor is gcd(28,35) = 7 so the   fraction in lowest terms is 4/5.</td>
</tr>
</tbody>
</table>
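<p>The additive column of the table can be sketched as a small Python class. The class name <code>TAccount</code> and its method names are illustrative assumptions; integer entries are used so that the equality test is exact.</p>

```python
class TAccount:
    """A T-account [debit // credit] with non-negative entries."""

    def __init__(self, debit, credit):
        self.debit = debit
        self.credit = credit

    def __add__(self, other):
        # Add debits to debits and credits to credits.
        return TAccount(self.debit + other.debit, self.credit + other.credit)

    def __neg__(self):
        # The additive inverse reverses the debit and credit entries.
        return TAccount(self.credit, self.debit)

    def __eq__(self, other):
        # Two T-accounts are equal when their cross-sums are equal.
        return self.debit + other.credit == self.credit + other.debit

    def reduced(self):
        # Subtract the common minimum so that the two sides are disjoint.
        m = min(self.debit, self.credit)
        return TAccount(self.debit - m, self.credit - m)

    def __repr__(self):
        return f"[{self.debit} // {self.credit}]"


a = TAccount(12, 10)
print(a.reduced())                              # [2 // 0]
print(TAccount(102, 100) == TAccount(52, 50))   # True
print(a + (-a) == TAccount(0, 0))               # True
```

<p>The second line of output reproduces Littlewood&#8217;s bank example: a credit of £102 against a debit of £100 is the same balance as £52 against £50.</p>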
<p>The <em>Pacioli group</em> P consists of the ordered pairs [x // y] of non-negative scalars, with the above definitions of addition and equality.  The Pacioli group P is isomorphic to the additive group of real numbers <strong>R</strong> under two isomorphisms: the <em>debit isomorphism</em>, which associates [w // x] with w–x, and the <em>credit isomorphism</em>, which associates [w // x] with x–w.  In order to translate from T-accounts  [x // y] back and forth to general signed reals z, one needs to specify whether to use the debit or credit isomorphism.  This will be done by labeling the T-account as <em>debit balance</em> or <em>credit balance</em>.  Thus if a T-account  [x // y] is debit balance, the corresponding scalar is x–y, and if it is credit balance, then the corresponding scalar is y–x.</p>
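<p>The two isomorphisms can be written out directly. In this Python sketch a T-account is represented as a plain (debit, credit) tuple, and the function names are hypothetical labels for the maps described above.</p>

```python
def debit_balance(t):
    """Debit isomorphism: [d // c] maps to d - c."""
    d, c = t
    return d - c

def credit_balance(t):
    """Credit isomorphism: [d // c] maps to c - d."""
    d, c = t
    return c - d

def from_debit_balance(z):
    """Reduced T-account whose debit balance is the signed number z."""
    return (max(z, 0), max(-z, 0))

def from_credit_balance(z):
    """Reduced T-account whose credit balance is the signed number z."""
    return (max(-z, 0), max(z, 0))

def add(s, t):
    """Add debits to debits and credits to credits."""
    return (s[0] + t[0], s[1] + t[1])

s, t = (12, 10), (3, 8)
# Each map respects addition, as an isomorphism of groups must.
print(debit_balance(add(s, t)), debit_balance(s) + debit_balance(t))
print(credit_balance(add(s, t)), credit_balance(s) + credit_balance(t))
print(from_debit_balance(-3), from_credit_balance(-3))
```

<p>The same signed number corresponds to mirror-image T-accounts under the two maps, which is why a T-account must be labeled debit balance or credit balance before it is translated.</p>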
<h3>Double-entry versus &#8220;single-entry&#8221; accounting</h3>
<p>Given an equation w + &#8230; + x = y + &#8230; + z, it is not possible to change just one term in the equation and have it still hold.  Two or more terms must be changed.  The fact that two or more terms (or &#8220;accounts&#8221;) must be changed is <em>not</em> the basis for the double-entry method (in spite of every accounting book that I have checked saying that).  That mathematical fact is a characteristic of the transaction itself (the changes in the equation), not a characteristic of the method of recording the transaction.  The double-entry method is a method of encoding an equation using ordered pairs or T-accounts and using unsigned numbers (non-negative numbers) to record transactions to make changes in the equation.  While there is unfortunately considerable confusion about this in the accounting literature, the doubleness of &#8220;double-entry&#8221; is the two-sidedness of the T-accounts and the mathematical properties that follow (e.g., equal debits and credits in a transaction, and equal debits and credits in the trial balance of the whole set of accounts or ledger).</p>
<p>The alternative to the double entry method is to record a transaction by making a single entry of adding a <em>signed</em> (positive or negative) number to each affected account.  Two or more accounts in the equation would still always be affected by this alternative method of recording a transaction (since that is a property of the transaction itself, not of the recording method).  Such a system is a complete accounting system to update the balance sheet equation but would have no two-sided T-accounts, no debits or credits, no double entry principle (equal debits and credits in a transaction), and no trial balance of adding debits and credits.</p>
<p>Unfortunately, the phrase &#8220;<a href="http://en.wikipedia.org/wiki/Single-entry_accounting_system">single entry accounting</a>&#8221; is normally used to denote simply an incomplete accounting &#8220;system&#8221; (e.g., no equity account) where there is no equation to be updated.  But without an equation, that is not an alternative &#8220;system&#8221; at all.  The real choice between the double entry method and the complete single entry method of recording a transaction is the choice between using unsigned (&#8220;single-sided&#8221;) numbers in two-sided accounts (DEB) or signed (&#8220;two-sided&#8221;) numbers in &#8220;single-sided&#8221; accounts (complete single-entry accounting).</p>
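<p>The contrast between the two recording methods can be sketched in a few lines of Python. The opening balances (100 = 60 + 40), the transaction, and all function names below are hypothetical and chosen only for illustration. The same change is recorded once with signed numbers in single-sided accounts and once as unsigned debit and credit entries.</p>

```python
# Which side of the equation each account sits on (illustrative accounts).
side = {"Assets": "LHS", "Liabilities": "RHS", "Equity": "RHS"}

# Complete single-entry recording: one signed number per affected account.
signed = {"Assets": 100, "Liabilities": 60, "Equity": 40}


def post_signed(balances, changes):
    """Add a signed change to each affected account."""
    return {k: v + changes.get(k, 0) for k, v in balances.items()}


def equation_holds(balances):
    lhs = sum(v for k, v in balances.items() if side[k] == "LHS")
    rhs = sum(v for k, v in balances.items() if side[k] == "RHS")
    return lhs == rhs


def to_t_entry(name, change):
    """Double-entry recording: encode a signed change as an unsigned (debit, credit) pair."""
    increase_is_debit = side[name] == "LHS"
    amount = abs(change)
    if (change >= 0) == increase_is_debit:
        return (amount, 0)
    return (0, amount)


# One transaction touching two accounts in either method.
changes = {"Assets": -30, "Equity": -30}
after = post_signed(signed, changes)

entries = {k: to_t_entry(k, v) for k, v in changes.items()}
total_debits = sum(d for d, _ in entries.values())
total_credits = sum(c for _, c in entries.values())
```

<p>In both versions two accounts are affected, since that is a property of the transaction. Only the second version has debits and credits that can be checked for equality.</p>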
<h3>Example of double-entry accounting</h3>
<p>Consider an example of a company with the simplified initial balance sheet equation:</p>
<table border="0" cellspacing="0" cellpadding="0" align="center">
<tbody>
<tr>
<td width="97" valign="top">Assets</td>
<td width="36" valign="top">=</td>
<td width="128" valign="top">Liabilities</td>
<td width="40" valign="top">+</td>
<td width="128" valign="top">Equity</td>
</tr>
<tr>
<td width="97" valign="top">15,000</td>
<td width="36" valign="top">=</td>
<td width="128" valign="top">10,000</td>
<td width="40" valign="top">+</td>
<td width="128" valign="top">5,000</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Beginning Scalar Balance Sheet</p>
<p>It is customary in accounting to move each term or &#8220;account&#8221; to the side of the equation so that it is preceded by a plus sign.  A T-account equal to the zero T-account [0 // 0] is called a <em>zero-account</em>.  Equations encode as zero-accounts.  Each left-hand side (LHS) term x is encoded as a debit-balance T-account [x // 0] and each right-hand side (RHS) term y is encoded as a credit-balance T-account [0 // y].  These T-accounts then would add up to the zero-account [0 // 0].  The balance sheet equation thus encodes as an equation zero-account which, by leaving out the plus signs, becomes the following initial ledger of T-accounts.</p>
<table border="0" cellspacing="0" cellpadding="0" align="center">
<tbody>
<tr>
<td width="97" valign="top">Assets</td>
<td width="36" valign="top"></td>
<td width="128" valign="top">Liabilities</td>
<td width="40" valign="top"></td>
<td width="128" valign="top">Equity</td>
</tr>
<tr>
<td width="97" valign="top">[15000 // 0]</td>
<td width="36" valign="top"></td>
<td width="128" valign="top">[0 // 10000]</td>
<td width="40" valign="top"></td>
<td width="128" valign="top">[0 // 5000]</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Beginning Ledger of T-Accounts</p>
<p>Consider three transactions in a productive firm.</p>
<p>1. $1200 of input inventories are used up and charged directly to equity.<br />
2. $1500 of product is produced, sold, and added directly to equity.<br />
3. $800 principal payment is made on a loan.</p>
<p>Each transaction is then encoded as a transactional zero-account and added to the appropriate terms of the equational zero-account (&#8220;posting the journal to the ledger&#8221;).  For instance, the first transaction subtracts 1200 from Assets and subtracts 1200 from Equity.  The Assets account is encoded as a LHS or debit-balance account so subtracting a number from it would be encoded as adding the T-account [0 // 1200] to it.  Equity is encoded as a RHS or credit-balance term so subtracting 1200 from it would be encoded as adding [1200 // 0] to it.  The other transactions are encoded in a similar manner.</p>
<table border="0" cellspacing="0" cellpadding="0" width="589" align="center">
<tbody>
<tr>
<td width="229" valign="top"></td>
<td width="120" valign="top">Assets</td>
<td width="114" valign="top">Liabilities</td>
<td width="126" valign="top">Equity</td>
</tr>
<tr>
<td width="229" valign="top">Original   equation zero-account:</td>
<td width="120" valign="top">[15000   // 0]</td>
<td width="114" valign="top">[0   // 10000]</td>
<td width="126" valign="top">[0   // 5000]</td>
</tr>
<tr>
<td width="229" valign="top">+   Transaction 1 zero-account:</td>
<td width="120" valign="top">[0   // 1200]</td>
<td width="114" valign="top"></td>
<td width="126" valign="top">[1200 // 0]</td>
</tr>
<tr>
<td width="229" valign="top">+   Transaction 2 zero-account:</td>
<td width="120" valign="top">[1500   // 0]</td>
<td width="114" valign="top"></td>
<td width="126" valign="top">[0   // 1500]</td>
</tr>
<tr>
<td width="229" valign="top">+   Transaction 3 zero-account:</td>
<td width="120" valign="top">[0   // 800]</td>
<td width="114" valign="top">[800   // 0]</td>
<td width="126" valign="top"></td>
</tr>
<tr>
<td width="229" valign="top">&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</td>
<td width="120" valign="top">&#8212;&#8212;&#8212;&#8212;</td>
<td width="114" valign="top">&#8212;&#8212;&#8212;&#8212;</td>
<td width="126" valign="top">&#8212;&#8212;&#8212;&#8212;</td>
</tr>
<tr>
<td width="229" valign="top">=   Ending equation zero-account:</td>
<td width="120" valign="top">[16500   // 2000]</td>
<td width="114" valign="top">[800   // 10000]</td>
<td width="126" valign="top">[1200   // 6500]</td>
</tr>
<tr>
<td width="229" valign="top">=   (in reduced form)</td>
<td width="120" valign="top">[14500   // 0]</td>
<td width="114" valign="top">[0   // 9200]</td>
<td width="126" valign="top">[0   // 5300]</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Initial Ledger + Journal = Ending Ledger</p>
<p>The initial T-accounts in the ledger add up to the zero-account (initial trial balance).  Each transaction is encoded as two or more T-accounts that add to the zero-account (double entry principle).  Zero added to zero equals zero.  Thus adding the transaction zero-accounts to the initial equation zero-account (posting journal to ledger) will yield another equation zero-account (which can be checked by taking another trial balance).  Each T-account is then decoded according to whether it was encoded as debit balance or credit balance to obtain the ending balance sheet equation.</p>
<table border="0" cellspacing="0" cellpadding="0" align="center">
<tbody>
<tr>
<td width="97" valign="top">Assets</td>
<td width="36" valign="top">=</td>
<td width="128" valign="top">Liabilities</td>
<td width="40" valign="top">+</td>
<td width="128" valign="top">Equity</td>
</tr>
<tr>
<td width="97" valign="top">14,500</td>
<td width="36" valign="top">=</td>
<td width="128" valign="top">9,200</td>
<td width="40" valign="top">+</td>
<td width="128" valign="top">5,300</td>
</tr>
</tbody>
</table>
<p style="text-align: center;">Ending Balance Sheet Equation</p>
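<p>The whole cycle of this example can be replayed in a short Python sketch. The figures are those of the example. The representation of a T-account as a <code>(debit, credit)</code> tuple and the helper names are assumptions made only for this sketch.</p>

```python
def add(a, b):
    """Add two T-accounts side by side."""
    return (a[0] + b[0], a[1] + b[1])


def reduced_form(t):
    """Subtract the minimum of the two sides from both sides."""
    m = min(t)
    return (t[0] - m, t[1] - m)


def is_zero_account(accounts):
    """Trial balance: total debits equal total credits."""
    accounts = list(accounts)
    return sum(d for d, _ in accounts) == sum(c for _, c in accounts)


# Beginning ledger, with each account labeled debit balance (DB) or credit balance (CB).
ledger = {"Assets": (15000, 0), "Liabilities": (0, 10000), "Equity": (0, 5000)}
balance_type = {"Assets": "DB", "Liabilities": "CB", "Equity": "CB"}

# Journal: one transaction zero-account per transaction.
journal = [
    {"Assets": (0, 1200), "Equity": (1200, 0)},     # inventories used up
    {"Assets": (1500, 0), "Equity": (0, 1500)},     # product produced and sold
    {"Assets": (0, 800), "Liabilities": (800, 0)},  # loan principal payment
]

# Posting the journal to the ledger.
for txn in journal:
    if not is_zero_account(txn.values()):
        raise ValueError("transaction violates the double-entry principle")
    for name, entry in txn.items():
        ledger[name] = add(ledger[name], entry)

reduced = {name: reduced_form(t) for name, t in ledger.items()}

# Decode each T-account with the isomorphism it was encoded with.
decoded = {
    name: (d - c if balance_type[name] == "DB" else c - d)
    for name, (d, c) in ledger.items()
}
```

<p>After the loop, <code>ledger</code> holds the unreduced ending T-accounts, <code>reduced</code> holds their reduced forms, and <code>decoded</code> holds the three terms of the ending balance sheet equation.</p>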
<p>A vector example will be given in Part II.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathblog.ellerman.org/2010/02/the-math-of-double-entry-bookkeeping-part-i-scalars/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The Math of Double-Entry Bookkeeping: Part II (vectors)</title>
		<link>http://www.mathblog.ellerman.org/2010/02/the-math-of-double-entry-bookkeeping-part-ii-vectors/</link>
		<comments>http://www.mathblog.ellerman.org/2010/02/the-math-of-double-entry-bookkeeping-part-ii-vectors/#comments</comments>
		<pubDate>Fri, 05 Feb 2010 00:06:14 +0000</pubDate>
		<dc:creator>David Ellerman</dc:creator>
				<category><![CDATA[Applications]]></category>
		<category><![CDATA[Math economics]]></category>
		<category><![CDATA[double-entry bookkeeping]]></category>
		<category><![CDATA[multi-dimensional accounting]]></category>
		<category><![CDATA[Pacioli]]></category>
		<category><![CDATA[vectors]]></category>

		<guid isPermaLink="false">http://www.mathblog.ellerman.org/?p=14</guid>
		<description><![CDATA[Although double-entry bookkeeping (DEB) has been used in the business world for 5 centuries, the mathematical formulation of the double entry method is almost completely unknown. ]]></description>
			<content:encoded><![CDATA[<h3>Multi-dimensional double-entry accounting?</h3>
<p>Although double-entry bookkeeping (DEB) has been used in the business world for 5 centuries, the mathematical formulation of the double entry method is almost completely unknown. In this post, the mathematical treatment of double-entry bookkeeping using scalars given in Part I is generalized to the multi-dimensional case using vectors.  The success in maintaining the two-sided accounts, debits and credits, the double-entry principle, and the trial balance in both cases shows that the formulation captures the double-entry method in mathematical form.</p>
<p><span id="more-14"></span></p>
<p>One acid test of a mathematical formulation of a theory is the question of whether or not it facilitates the generalization of the theory.  Normal bookkeeping does not deal with incommensurate physical quantities; everything is expressed in the common units of money.  Hence the question has previously arisen of a generalization of DEB to deal with multi-dimensional incommensurates with no common measure of value.</p>
<p>In the literature on the &#8220;mathematics&#8221; of accounting, there is a proposed &#8220;solution&#8221; to this question, a system of multi-dimensional physical accounting published by <a href="http://en.wikipedia.org/wiki/Yuji_Ijiri">Yuji Ijiri</a> at least three times in 1965, 1966, and 1967.  In this system, most of the normal structure of DEB was lost:</p>
<p>• there was no balance sheet equation,<br />
• there were no equity or proprietorship accounts,<br />
• the temporary or nominal accounts could not be closed, and<br />
• the &#8220;trial balance&#8221; did not balance.</p>
<p>It is common for certain aspects of a theory to be lost in a generalization of the theory.</p>
<blockquote><p>For instance, the convenient idea of an accounting identity is lost since the dimensional and metric comparability it assumes is no longer present except under special circumstances. [Ijiri 1967, 333]</p></blockquote>
<p>The accounting community has apparently accepted the failure of all these features of DEB as the necessary price to be paid to generalize DEB to incommensurate physical quantities. The other papers on &#8220;Double-entry multidimensional accounting&#8221; published in the accounting literature [e.g., Charnes et al. 1976, or Haseman and Whinston 1976] acquiesced in the absence of the balance-sheet equation.</p>
<p>Yet when DEB is mathematically formulated using the group of differences from undergraduate algebra, then the generalization to vectors of incommensurate physical quantities is immediate and trivial.  <em>All</em> of the normal features of DEB—such as the balance-sheet equation, the equity account, the temporary accounts, and the trial balance—are preserved in the generalization [see Ellerman 1982, 1985, 1986, 2009].  Thus the model of multidimensional DEB &#8220;accepted&#8221; in the accounting community was simply a failed attempt at generalization which had been &#8220;received&#8221; as a successful generalization that unfortunately had to &#8220;sacrifice&#8221; certain features of DEB.</p>
<p>Due to the remarkable intellectual insulation between mathematics and accounting, the successful mathematical treatment and generalization of double-entry bookkeeping (first published over a quarter-century ago in 1982) will take many more years to become known and understood in the accounting literature.</p>
<h3>The double-entry method: general vector case</h3>
<p>The <em>Pacioli group</em> P<sup>n</sup> consists of the ordered pairs [x // y] of non-negative n-dimensional vectors, with the usual definitions of componentwise addition and equality (equality of cross-sums).  The Pacioli group P<sup>n</sup> is isomorphic with all of <strong>R</strong><sup>n</sup> (the set of all real n-vectors with positive and negative components) under two isomorphisms: the <em>debit isomorphism</em>, which associates [w // x] with w–x, and the <em>credit isomorphism</em>, which associates [w // x] with x–w.  In order to translate a T-account  [x // y] back and forth to a general vector z, one needs to specify whether to use the debit or credit isomorphism.  This will be done by labeling the T-account as <em>debit balance</em> (DB) or <em>credit balance</em> (CB).  Thus if a T-account  [x // y] is debit balance, the corresponding vector is x–y, and if it is credit balance, then the corresponding vector is y–x.</p>
<p>The general case of the double-entry method starts with an equation between sums of n-dimensional vectors.  Vector equations are first <em>encoded</em> in the Pacioli group constructed from the non-negative n-dimensional vectors.  Since the vectors in a T-account must be non-negative, we must first develop a way to separate out the positive and negative components of a vector.  The <em>positive part</em> of a vector x is x<sup>+</sup> = max(x,0), the componentwise maximum of x and the zero vector [note that "0" is used, depending on the context, to refer to the zero scalar or the zero vector].  The <em>negative part</em> of x is x<sup>–</sup> = –min(x,0), the componentwise negative of the minimum of x and the zero vector.  Both the positive and negative parts of a vector x are non-negative vectors.  Every vector x has a &#8220;Jordan decomposition&#8221; x = x<sup>+</sup> – x<sup>–</sup>.  The two isomorphisms that map vectors to T-accounts of non-negative vectors are the debit isomorphism that maps x to the T-account [x<sup>+</sup> // x<sup>–</sup>] and the credit isomorphism that maps x to [x<sup>–</sup> // x<sup>+</sup>]. A T-account of non-negative vectors is in reduced form if it is in reduced form componentwise.</p>
<p>Given any vector equation in <strong>R</strong><sup>n</sup>, w + &#8230; + x = y + &#8230; + z, each left-hand side (LHS) vector x is encoded via the debit isomorphism as a debit-balance T-account [x<sup>+</sup> // x<sup>–</sup>] and each right-hand side (RHS) vector y is encoded via the credit isomorphism as a credit-balance T-account [y<sup>–</sup> // y<sup>+</sup>].  Then the original equation holds if and only if the sum of the encoded T-accounts is a zero-account:</p>
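<p>A minimal Python sketch of the positive part, the negative part, and the two encodings follows. Vectors are represented as plain tuples, and the function names and the sample vector (3, –4, 0) are hypothetical choices made for this illustration.</p>

```python
def positive_part(x):
    """x+ = max(x, 0), taken componentwise."""
    return tuple(max(c, 0) for c in x)


def negative_part(x):
    """x- = -min(x, 0), taken componentwise."""
    return tuple(-min(c, 0) for c in x)


def debit_encode(x):
    """Debit isomorphism: x maps to [x+ // x-]."""
    return (positive_part(x), negative_part(x))


def credit_encode(x):
    """Credit isomorphism: x maps to [x- // x+]."""
    return (negative_part(x), positive_part(x))


def is_reduced(t):
    """Reduced form componentwise: min of the two sides is 0 in every component."""
    debit, credit = t
    return all(min(d, c) == 0 for d, c in zip(debit, credit))


x = (3, -4, 0)
xp, xn = positive_part(x), negative_part(x)

# Jordan decomposition: x = x+ - x-.
recombined = tuple(p - n for p, n in zip(xp, xn))
```

<p>Both encodings of a vector are automatically in reduced form, since in each component at most one of x<sup>+</sup> and x<sup>–</sup> is nonzero.</p>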
<p style="text-align: center;">w + &#8230; + x = y + &#8230; + z<br />
if and only if<br />
[w<sup>+</sup> // w<sup>–</sup>] + &#8230; + [x<sup>+</sup> // x<sup>–</sup>] + [y<sup>–</sup> // y<sup>+</sup>] + &#8230; + [z<sup>–</sup> // z<sup>+</sup>]</p>
<p>is a zero-account.</p>
<p>Given the equation, the sum of the encoded T-accounts is the <em>equation zero-account</em> of the equation.  Since only plus signs can appear between the T-accounts in an equational zero-account, the plus signs can be left implicit.  The listing of the T-accounts in an equational zero-account (without the plus signs) is the <em>ledger</em>.</p>
<p>Changes in the various terms or &#8220;accounts&#8221; in the beginning equation are recorded as <em>transactions</em>.  Transactions must be recorded as valid algebraic operations which transform equations into equations.  Since equations encode as zero-accounts, a valid algebraic operation would transform zero-accounts into zero-accounts.  There is only one such operation in the Pacioli group: add on a zero-account.  Zero plus zero equals zero.  The zero-accounts representing transactions are called <em>transaction zero-accounts</em>.  The listing of the transactional zero-accounts is the <em>journal</em>.</p>
<p>A series of valid additive operations on a vector equation can then be presented in the following standard scheme:</p>
<p>Beginning Equation Zero-Account<br />
+    <span style="text-decoration: underline;">Transaction Zero-Accounts</span><br />
=    Ending Equation Zero-Account</p>
<p>or, in more conventional terminology,</p>
<p>Beginning Ledger<br />
+ <span style="text-decoration: underline;">Journal</span><br />
= Ending Ledger.</p>
<p>The process of adding the transaction zero-accounts to the initial ledger to obtain the ledger at the end of the accounting period is called <em>posting the journal to the ledger</em>.  The fact that a transaction zero-account is equal to [0 // 0] is traditionally expressed as the <em>double-entry principle</em> that transactions are recorded with equal debits and credits.  The summing of the debit and credit sides of what should be an equation zero-account to check that it is indeed a zero-account is traditionally called the <em>trial balance</em>.  <em>All</em> those features from the scalar case of DEB carry over effortlessly to the general vector case.</p>
<p>At the end of the cycle, the ending equational zero-account is decoded to obtain the equation that results from the algebraic operations represented in the transactions.  The T-accounts in an equational zero-account can be arbitrarily partitioned into two sets: DB (debit balance) and CB (credit balance).  T-accounts [w // x] in DB are decoded as w–x on the left side of the equation, and T-accounts [w // x] in CB are decoded as x–w on the right side of the equation.  Given a zero-account, this algorithm yields an equation.  In an accounting application, the T-accounts in the final equation zero-account would be partitioned into sets DB and CB according to the side of the initial equation from which they were encoded.</p>
<h3>Simple example of double-entry vector accounting</h3>
<p>Consider the following initial vector equation:</p>
<p style="text-align: center;">(6, –3, 10) + (–2, 5, –2) = (4, 2, 8).<br />
Sample Vector Equation to be Encoded</p>
<p>It encodes as the equation zero-account (taking the LHS vectors as DB accounts and the RHS vectors as CB accounts):</p>
<p style="text-align: center;">[(6, 0, 10) // (0, 3, 0)] + [(0, 5, 0) // (2, 0, 2)] + [(0, 0, 0) // (4, 2, 8)].<br />
<em>Equation Encoded as a Zero T-Account</em></p>
<p>Suppose that a transaction subtracts the vector (–2, –9, 1) from the first vector on the LHS and from the vector on the RHS of the original equation to obtain the ending equation:</p>
<p style="text-align: center;">(8, 6, 9) + (–2, 5, –2) = (6, 11, 7).<br />
<em>Ending Vector Equation</em></p>
<p>To perform this operation using the double-entry method, the subtracting of the vector (–2, –9, 1) from the first LHS term is encoded using the credit isomorphism to get [(2,9,0) // (0,0,1)] which is added to the first LHS or debit-balance term in the T-account version of the original equation.  In more traditional terminology, we would say that (–2, –9, 1) is &#8220;credited&#8221; to that debit-balance account.  For the subtraction from the RHS term, the vector is encoded using the debit isomorphism to obtain [(0,0,1) // (2,9,0)] and added to the credit-balance T-account version of the RHS term.  That is, (–2, –9, 1) is &#8220;debited&#8221; to that credit-balance account.</p>
<p>In the scalar case, a T-account will always have a reduced form either as [d // 0] or [0 // c] so that adding [d // 0] to an account (a term in the equational zero-account) can be described as &#8220;debiting d to the account&#8221; and similarly for &#8220;crediting c to the account.&#8221;  For vector T-accounts, the reduced form of a T-account does not necessarily have the zero vector on one side or the other.  In this case, the reduced form of the T-account encoding of  (–2, –9, 1) would be &#8220;mixed.&#8221;  The &#8220;debit&#8221; takes the form of adding the T-account [(0,0,1) // (2,9,0)] obtained by applying the debit isomorphism to the term (–2, –9, 1), and the &#8220;credit&#8221; takes the form of adding the inverse [(2,9,0) // (0,0,1)] obtained by applying the credit isomorphism to the term. This yields another equational zero-account:</p>
<table border="1" cellspacing="0" cellpadding="0" width="647" align="center">
<tbody>
<tr>
<td width="209" valign="top">Original   Eq. zero-account:</td>
<td width="148" valign="top">[(6,0,10)   // (0,3,0)]</td>
<td width="142" valign="top">[(0,5,0)   // (2,0,2)]</td>
<td width="148" valign="top">[(0,0,0)   // (4,2,8)]</td>
</tr>
<tr>
<td width="209" valign="top">+   Transaction zero-account:</td>
<td width="148" valign="top">[(2,9,0)  // (0,0,1)]</td>
<td width="142" valign="top"></td>
<td width="148" valign="top">[(0,0,1)  // (2,9,0)]</td>
</tr>
<tr>
<td width="209" valign="top">&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</td>
<td width="148" valign="top">&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</td>
<td width="142" valign="top">&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</td>
<td width="148" valign="top">&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</td>
</tr>
<tr>
<td width="209" valign="top">=   Ending eq. zero-account</td>
<td width="148" valign="top">[(8,9,10)  // (0,3,1)]</td>
<td width="142" valign="top">[(0,5,0)  // (2,0,2)]</td>
<td width="148" valign="top">[(0,0,1)  // (6,11,8)]</td>
</tr>
<tr>
<td width="209" valign="top">=   (reduced form)</td>
<td width="148" valign="top">[(8,6,9)  // (0,0,0)]</td>
<td width="142" valign="top">[(0,5,0)  // (2,0,2)]</td>
<td width="148" valign="top">[(0,0,0)  // (6,11,7)]</td>
</tr>
</tbody>
</table>
<p style="text-align: center;"><em>Beginning Ledger + Journal = Ending Ledger</em></p>
<p>After a number of such transactions, the ending equation zero-account is then decoded to obtain an equation back in <strong>R</strong><sup>n</sup>.  In this case, let the first two T-accounts be debit-balance and the third one credit-balance (as they were originally encoded).  Then the ending equational zero-account decodes as the correct vector equation:</p>
<p style="text-align: center;">(8, 6, 9) + (–2, 5, –2) = (6, 11, 7).<br />
Decoded Ending Equation</p>
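<p>The vector example can be checked mechanically. The following Python sketch encodes the sample equation, posts the transaction, takes the trial balance, and decodes the result. Vectors are tuples and a T-account is a <code>(debit, credit)</code> pair of tuples; these representations and all helper names are assumptions made for the sketch.</p>

```python
def pos(x):
    """Positive part x+ = max(x, 0), componentwise."""
    return tuple(max(c, 0) for c in x)


def neg(x):
    """Negative part x- = -min(x, 0), componentwise."""
    return tuple(-min(c, 0) for c in x)


def vadd(a, b):
    return tuple(p + q for p, q in zip(a, b))


def vsub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def debit_encode(x):
    return (pos(x), neg(x))


def credit_encode(x):
    return (neg(x), pos(x))


def t_add(s, t):
    """Add two vector T-accounts side by side."""
    return (vadd(s[0], t[0]), vadd(s[1], t[1]))


def reduced_form(t):
    """Subtract the componentwise minimum of the two sides from both sides."""
    m = tuple(min(d, c) for d, c in zip(t[0], t[1]))
    return (vsub(t[0], m), vsub(t[1], m))


def is_zero_account(accounts):
    """Trial balance: the debit vectors and credit vectors have equal sums."""
    debits, credits = (0, 0, 0), (0, 0, 0)
    for d, c in accounts:
        debits, credits = vadd(debits, d), vadd(credits, c)
    return debits == credits


ZERO = ((0, 0, 0), (0, 0, 0))
kinds = ["DB", "DB", "CB"]  # how each account was encoded

# (6, -3, 10) + (-2, 5, -2) = (4, 2, 8), encoded as an equation zero-account.
ledger = [debit_encode((6, -3, 10)), debit_encode((-2, 5, -2)), credit_encode((4, 2, 8))]

# Subtract (-2, -9, 1) from the first LHS term and from the RHS term.
delta = (-2, -9, 1)
journal = [credit_encode(delta), ZERO, debit_encode(delta)]

ending = [t_add(acct, entry) for acct, entry in zip(ledger, journal)]
reduced = [reduced_form(t) for t in ending]
decoded = [
    vsub(d, c) if kind == "DB" else vsub(c, d)
    for (d, c), kind in zip(ending, kinds)
]
```

<p>The lists <code>ending</code> and <code>reduced</code> correspond to the last two rows of the table above, and <code>decoded</code> holds the three vectors of the decoded ending equation.</p>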
<h3>References</h3>
<p>Charnes, A., C. Colantoni and W. W. Cooper. 1976.  A futurological justification for historical cost and multidimensional accounting.  <em>Accounting, Organizations, and Society </em>1, no. 4: 315-37.</p>
<p>Ellerman, David. 1982.  <a href="http://www.ellerman.org/Davids-Stuff/Books/EAPT.CV.pdf"><em>Economics, Accounting, and Property Theory</em></a>.  Lexington, Mass.: D. C. Heath.</p>
<p>Ellerman, David. 1985.  <a href="http://www.ellerman.org/Davids-Stuff/Maths/DEB-Math-Mag.CV.pdf">The Mathematics of Double Entry Bookkeeping</a>.  <em>Mathematics Magazine</em>. 58 (September): 226-33.</p>
<p>Ellerman, David. 1986.  <a href="http://www.ellerman.org/Davids-Stuff/Maths/Omega-DEB.CV.pdf">Double Entry Multidimensional Accounting</a>.  <em>Omega, International Journal of Management Science</em> 14, no. 1: 13-22.</p>
<p>Ellerman, David. 2009.  <a href="http://www.ellerman.org/Davids-Stuff/Maths/FSR-Forum-DEB.pdf">Double-Entry Accounting: The Mathematical Formulation and Generalization</a>.  <em>FSR Forum (Financial Studies Association Rotterdam)</em>. February: 17-22.</p>
<p>Haseman, W., and A. Whinston. 1976.  Design of a multidimensional accounting system.  <em>Accounting Review</em> 51, no. 1: 65-79.</p>
<p>Ijiri, Y. 1965.  <em>Management Goals and Accounting for Control</em>.  Amsterdam: North-Holland.</p>
<p>Ijiri, Y. 1966.  Physical Measures and Multi-dimensional Accounting.  In <em>Research in Accounting Measurement</em>, ed. R. K. Jaedicke, Y. Ijiri, and O. Nielsen, 150-64. Sarasota, Fla.: American Accounting Association.</p>
<p>Ijiri, Y. 1967.  <em>The Foundations of Accounting Measurement: A Mathematical, Economic, and Behavioural Inquiry</em>.  Englewood Cliffs, N.J.: Prentice-Hall.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathblog.ellerman.org/2010/02/the-math-of-double-entry-bookkeeping-part-ii-vectors/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The implication operation on partitions</title>
		<link>http://www.mathblog.ellerman.org/2010/02/the-implication-operation-on-partitions-2/</link>
		<comments>http://www.mathblog.ellerman.org/2010/02/the-implication-operation-on-partitions-2/#comments</comments>
		<pubDate>Tue, 02 Feb 2010 20:20:34 +0000</pubDate>
		<dc:creator>David Ellerman</dc:creator>
				<category><![CDATA[Math logic]]></category>
		<category><![CDATA[Partition logic]]></category>
		<category><![CDATA[equivalence relations]]></category>
		<category><![CDATA[Gian-Carlo Rota]]></category>
		<category><![CDATA[implication]]></category>
		<category><![CDATA[partitions]]></category>

		<guid isPermaLink="false">http://www.mathblog.ellerman.org/?p=10</guid>
		<description><![CDATA[In a 2001 commemorative volume for my mathematical mentor, Gian-Carlo Rota, three of his associates noted that "the only operations on the family of equivalence relations fully studied, understood and deployed are the binary join $latex \lor$ and meet $latex \land$ operations." This note defines the apparently new operation of implication for partitions, an operation that was key to the development of the logic of partitions that is dual to the usual logic of subsets.]]></description>
			<content:encoded><![CDATA[<h3>Partitions and equivalence relations</h3>
<p>In a 2001 commemorative volume for my mathematical mentor, Gian-Carlo Rota, three of his associates noted that &#8220;the only operations on the family of equivalence relations fully studied, understood and deployed are the binary join <img src='http://s.wordpress.com/latex.php?latex=%5Clor&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\lor' title='\lor' class='latex' /> and meet <img src='http://s.wordpress.com/latex.php?latex=%5Cland&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\land' title='\land' class='latex' /> operations.&#8221; This note defines the apparently new operation of implication for partitions, an operation that was key to the development of the logic of partitions that is dual to the usual logic of subsets.</p>
<p><span id="more-10"></span></p>
<p>A <em>partition</em> <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%3D%20%5C%7BB%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi = \{B\}' title='\pi = \{B\}' class='latex' /> on a set U (two or more elements) is a set of disjoint subsets <img src='http://s.wordpress.com/latex.php?latex=B%2CB%27%2C%5Cldots%5Csubseteq%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B,B&#039;,\ldots\subseteq U' title='B,B&#039;,\ldots\subseteq U' class='latex' />, called the <em>blocks</em> of the partition, whose union is U. The notion of a partition on a set is equivalent to the notion of an <em>equivalence relation</em> <img src='http://s.wordpress.com/latex.php?latex=R&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R' title='R' class='latex' /> on U which is a binary relation on U, i.e., a subset of the product <img src='http://s.wordpress.com/latex.php?latex=U%20%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U \times U' title='U \times U' class='latex' />, that is reflexive (<img src='http://s.wordpress.com/latex.php?latex=uRu&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='uRu' title='uRu' class='latex' /> for all <img src='http://s.wordpress.com/latex.php?latex=u%20%5Cin%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='u \in U' title='u \in U' class='latex' />), symmetric (if <img src='http://s.wordpress.com/latex.php?latex=uRu%27&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='uRu&#039;' title='uRu&#039;' class='latex' /> then <img src='http://s.wordpress.com/latex.php?latex=u%27Ru&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='u&#039;Ru' title='u&#039;Ru' class='latex' /> for all <img src='http://s.wordpress.com/latex.php?latex=u%2Cu%27%20%5Cin%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='u,u&#039; \in U' title='u,u&#039; \in U' class='latex' />), and transitive (if <img src='http://s.wordpress.com/latex.php?latex=uRu%27&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='uRu&#039;' title='uRu&#039;' class='latex' /> and <img 
src='http://s.wordpress.com/latex.php?latex=u%27Ru%27%27&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='u&#039;Ru&#039;&#039;' title='u&#039;Ru&#039;&#039;' class='latex' /> then <img src='http://s.wordpress.com/latex.php?latex=uRu%27%27&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='uRu&#039;&#039;' title='uRu&#039;&#039;' class='latex' /> for any <img src='http://s.wordpress.com/latex.php?latex=u%2Cu%27%2Cu%27%27%20%5Cin%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='u,u&#039;,u&#039;&#039; \in U' title='u,u&#039;,u&#039;&#039; \in U' class='latex' />). Given a partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' />, the corresponding equivalence relation has <img src='http://s.wordpress.com/latex.php?latex=uR_%7B%5Cpi%7Du%27&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='uR_{\pi}u&#039;' title='uR_{\pi}u&#039;' class='latex' /> if <img src='http://s.wordpress.com/latex.php?latex=u%2Cu%27%20%5Cin%20B&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='u,u&#039; \in B' title='u,u&#039; \in B' class='latex' /> for some block <img src='http://s.wordpress.com/latex.php?latex=B%20%5Cin%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B \in \pi' title='B \in \pi' class='latex' />. Given an equivalence relation <img src='http://s.wordpress.com/latex.php?latex=R%20%5Csubseteq%20U%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R \subseteq U\times U' title='R \subseteq U\times U' class='latex' />, the corresponding partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi_%7BR%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi_{R}' title='\pi_{R}' class='latex' /> has as blocks the equivalence classes <img src='http://s.wordpress.com/latex.php?latex=%5Bu%5D%3D%5C%7Bu%27%3AuRu%27%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='[u]=\{u&#039;:uRu&#039;\}' title='[u]=\{u&#039;:uRu&#039;\}' class='latex' /> of the equivalence relation.</p>
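<p>The passage back and forth between partitions and equivalence relations can be sketched in Python for a small finite set. Blocks are represented as frozensets and a relation as a set of ordered pairs. The function names and the four-element example are hypothetical and serve only as an illustration.</p>

```python
from itertools import product


def relation_from_partition(blocks):
    """u R u' holds when u and u' lie in the same block."""
    return {(u, v) for block in blocks for u, v in product(block, repeat=2)}


def partition_from_relation(universe, relation):
    """The blocks are the equivalence classes [u] = {u' : u R u'}."""
    return {frozenset(v for v in universe if (u, v) in relation) for u in universe}


def is_equivalence(universe, relation):
    reflexive = all((u, u) in relation for u in universe)
    symmetric = all((v, u) in relation for (u, v) in relation)
    transitive = all(
        (u, w) in relation
        for (u, v) in relation
        for (v2, w) in relation
        if v == v2
    )
    return reflexive and symmetric and transitive


U = {"a", "b", "c", "d"}
pi = {frozenset({"a", "b"}), frozenset({"c"}), frozenset({"d"})}

R = relation_from_partition(pi)
round_trip = partition_from_relation(U, R)
```

<p>Here <code>R</code> contains the six pairs (a,a), (a,b), (b,a), (b,b), (c,c), (d,d), and converting it back recovers the original partition.</p>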
<h3>Turning the partial ordering right side up</h3>
<p>Most of the previous work on partitions has been guided by thinking in terms of equivalence relations. For instance, there is a partial order on the set of partitions <img src='http://s.wordpress.com/latex.php?latex=%5CPi%28U%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\Pi(U)' title='\Pi(U)' class='latex' /> on U. Given two partitions <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%3D%5C%7BB%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi=\{B\}' title='\pi=\{B\}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%3D%5C%7BC%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma =\{C\}' title='\sigma =\{C\}' class='latex' />, the corresponding equivalence relations <img src='http://s.wordpress.com/latex.php?latex=R_%7B%5Cpi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R_{\pi}' title='R_{\pi}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=R_%7B%5Csigma%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R_{\sigma}' title='R_{\sigma}' class='latex' /> are subsets of <img src='http://s.wordpress.com/latex.php?latex=U%20%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U \times U' title='U \times U' class='latex' /> so it might seem natural to order the partitions according to the inclusion ordering of the corresponding equivalence relations: <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%5Cleq%20%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi \leq \sigma' title='\pi \leq \sigma' class='latex' /> if <img src='http://s.wordpress.com/latex.php?latex=R_%7B%5Cpi%7D%5Csubseteq%20R_%7B%5Csigma%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R_{\pi}\subseteq R_{\sigma}' title='R_{\pi}\subseteq R_{\sigma}' class='latex' /> rather than the opposite ordering.</p>
<p>That is the partial order traditionally used for partitions and it is even called the &#8220;refinement ordering.&#8221; For instance, this is the ordering used in the Wikipedia article for <a href="http://en.wikipedia.org/wiki/Lattice_%28order%29">lattices</a> in the first illustration giving the &#8220;lattice of partitions&#8221; on a four-element set. But in terms of blocks, it means that for any block <img src='http://s.wordpress.com/latex.php?latex=B%5Cin%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B\in\pi' title='B\in\pi' class='latex' />, there is a block <img src='http://s.wordpress.com/latex.php?latex=C%5Cin%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='C\in\sigma' title='C\in\sigma' class='latex' /> such that <img src='http://s.wordpress.com/latex.php?latex=B%5Csubseteq%20C&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B\subseteq C' title='B\subseteq C' class='latex' /> so that the partition lower down on the so-called &#8220;refinement&#8221; ordering is more refined rather than less refined. The late <a href="http://en.wikipedia.org/wiki/Gian-Carlo_Rota">Gian-Carlo Rota</a> used to joke that it should be called the &#8220;unrefinement&#8221; ordering. Indeed, in a recent book <em>Combinatorics: The Rota Way</em> by two of Rota&#8217;s former students, that traditional ordering is aptly called the &#8220;reverse refinement&#8221; ordering.</p>
<p>For the purposes of seeing the analogies between the usual logic of subsets and the dual logic of partitions, it is important to use the reverse of the &#8220;reverse refinement&#8221; ordering which would be properly called the <em>refinement</em> ordering on partitions: <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%5Cpreceq%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma \preceq \pi' title='\sigma \preceq \pi' class='latex' /> if for any block <img src='http://s.wordpress.com/latex.php?latex=B%5Cin%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B\in\pi' title='B\in\pi' class='latex' />, there is a block <img src='http://s.wordpress.com/latex.php?latex=C%5Cin%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='C\in\sigma' title='C\in\sigma' class='latex' /> such that <img src='http://s.wordpress.com/latex.php?latex=B%5Csubseteq%20C&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B\subseteq C' title='B\subseteq C' class='latex' />. With this ordering, the top 1 of the ordering is the discrete partition on U whose blocks are the singletons <img src='http://s.wordpress.com/latex.php?latex=%5C%7Bu%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\{u\}' title='\{u\}' class='latex' /> for the elements <img src='http://s.wordpress.com/latex.php?latex=u%5Cin%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='u\in U' title='u\in U' class='latex' />. The bottom 0 of the ordering is the indiscrete partition, nicknamed the &#8220;blob,&#8221; whose only block contains all the elements of U.</p>
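<p>The refinement ordering can be checked mechanically on a small example. The following is only a minimal sketch, assuming partitions are represented as Python lists of sets; the function name <code>refines</code> and the sample partitions are illustrative choices, not notation from the paper.</p>

```python
def refines(pi, sigma):
    """True when every block of pi is contained in some block of sigma,
    i.e. sigma is below pi in the refinement ordering."""
    return all(any(B.issubset(C) for C in sigma) for B in pi)


U = {1, 2, 3, 4}
discrete = [{u} for u in sorted(U)]   # the top 1: all singletons
blob = [set(U)]                       # the bottom 0: one block
pi = [{1, 2}, {3, 4}]

pi_below_top = refines(discrete, pi)      # pi is below the discrete partition
bottom_below_pi = refines(pi, blob)       # the blob is below pi
top_below_pi = refines(pi, discrete)      # the discrete partition is not below pi
print(pi_below_top, bottom_below_pi, top_below_pi)  # True True False
```

<p>With this orientation the discrete partition sits at the top and the blob at the bottom, as described above.</p>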
<h3>The closure space <img src='http://s.wordpress.com/latex.php?latex=U%20%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U \times U' title='U \times U' class='latex' /></h3>
<p>The complement of an equivalence relation <img src='http://s.wordpress.com/latex.php?latex=R_%7B%5Cpi%7D%20%5Csubseteq%20U%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R_{\pi} \subseteq U\times U' title='R_{\pi} \subseteq U\times U' class='latex' /> might be called a <em>partition relation</em>. The ordered pairs <img src='http://s.wordpress.com/latex.php?latex=%28u%2Cu%27%29%5Cin%20U%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(u,u&#039;)\in U\times U' title='(u,u&#039;)\in U\times U' class='latex' /> in an equivalence relation <img src='http://s.wordpress.com/latex.php?latex=R_%7B%5Cpi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='R_{\pi}' title='R_{\pi}' class='latex' /> are the pairs of elements that are indistinct according to that equivalence relation so they may be called the<em> indistinctions</em> or, for short, <em>indits</em> of the relation and symbolized as: <img src='http://s.wordpress.com/latex.php?latex=indit%28%5Cpi%29%3DR_%7B%5Cpi%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='indit(\pi)=R_{\pi}' title='indit(\pi)=R_{\pi}' class='latex' />. The pairs in the complementary partition relation are the pairs of elements in distinct blocks of the partition so they might be called the <em>distinctions</em> or <em>dits</em> of the partition and symbolized as <img src='http://s.wordpress.com/latex.php?latex=dit%28%5Cpi%29%3Dindit%28%5Cpi%29%5E%7Bc%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='dit(\pi)=indit(\pi)^{c}' title='dit(\pi)=indit(\pi)^{c}' class='latex' /> which is the complement of the indit set within the product set <img src='http://s.wordpress.com/latex.php?latex=U%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U\times U' title='U\times U' class='latex' />.</p>
<p>The product set <img src='http://s.wordpress.com/latex.php?latex=U%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U\times U' title='U\times U' class='latex' /> has a natural closure operation defined on it, namely the reflexive-symmetric-transitive closure. Thus the closure of any subset of the &#8220;closure space&#8221; <img src='http://s.wordpress.com/latex.php?latex=U%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U\times U' title='U\times U' class='latex' /> is the smallest equivalence relation containing the set. The <em>closed</em> sets (the sets equal to their closure) are just the equivalence relations and also the indit sets of partitions on U. The complements of the closed sets are defined as usual as the <em>open</em> subsets which are thus the partition relations and the dit sets of partitions on U. But, <em>nota bene</em>, this is not a topological closure operation; the union of two closed subsets is not necessarily closed, i.e., the union of two equivalence relations is not necessarily an equivalence relation. With a closure operation and complementation, we can also as usual define the <em>interior</em> of a subset as the &#8220;complement of the closure of the complement.&#8221; Thus the interior of a subset is the largest dit set or open set contained in the subset.</p>
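<p>As a rough illustration of the closure and interior operators, here is a brute-force sketch for a small finite U. The names <code>closure</code> and <code>interior</code> and the representation of relations as sets of ordered pairs are assumptions made for this example only.</p>

```python
from itertools import product


def closure(pairs, U):
    """Smallest equivalence relation on U containing the given pairs."""
    rel = set(pairs)
    rel.update((u, u) for u in U)              # reflexive
    rel.update([(b, a) for (a, b) in rel])     # symmetric
    changed = True
    while changed:                             # transitive
        changed = False
        for (a, b), (c, d) in product(list(rel), repeat=2):
            if b == c and (a, d) not in rel:
                rel.add((a, d))
                changed = True
    return rel


def interior(pairs, U):
    """Complement of the closure of the complement."""
    full = set(product(U, repeat=2))
    return full - closure(full - set(pairs), U)


U = {1, 2, 3}
full = set(product(U, repeat=2))
r1 = closure({(1, 2)}, U)      # indit set of the partition with blocks {1,2} and {3}
r2 = closure({(2, 3)}, U)      # indit set of the partition with blocks {1} and {2,3}
union = r1.union(r2)

union_is_closed = closure(union, U) == union          # the union is not transitive
dits_are_open = interior(full - r1, U) == full - r1   # a dit set equals its interior
print(union_is_closed, dits_are_open)  # False True
```

<p>The first result shows concretely why the closure is not topological: the union of the two equivalence relations contains (1,2) and (2,3) but not (1,3).</p>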
<h3>Three methods to define partition operations</h3>
<p>With this machinery, we are now ready to define operations on partitions such as the two lattice operations of join and meet, and the operation crucial for logic, the implication operation. There are three principal ways to define partition operations:</p>
<ul>
<li>the set-of-blocks method,</li>
<li>the closure-space method, and</li>
<li>the graph-theoretic method.</li>
</ul>
<p>A fourth method using complete subalgebras of powerset Boolean algebras will not be used here.</p>
<p>To illustrate the set-of-blocks method, suppose we are given two partitions <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%3D%5C%7BB%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi=\{B\}' title='\pi=\{B\}' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%3D%5C%7BC%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma =\{C\}' title='\sigma =\{C\}' class='latex' /> by their sets of blocks. The blocks of the join <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%5Clor%20%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi \lor \sigma' title='\pi \lor \sigma' class='latex' /> are the non-empty intersections <img src='http://s.wordpress.com/latex.php?latex=B%5Ccap%20C&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B\cap C' title='B\cap C' class='latex' /> of the blocks of the two partitions. In the traditional treatment using the opposite partial ordering, this would be the &#8220;meet&#8221; of the two partitions. The set-of-blocks definition of the partition meet (i.e., the traditional &#8220;join&#8221;) is more complicated so we will use the other two methods described below to define it.</p>
<p>A simple illustration of the closure-space method is also provided by the join. Given the two dit sets of partitions, the dit set of their join is just the union: <img src='http://s.wordpress.com/latex.php?latex=dit%28%5Cpi%20%5Clor%20%5Csigma%29%3Ddit%28%5Cpi%29%20%5Ccup%20dit%28%5Csigma%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='dit(\pi \lor \sigma)=dit(\pi) \cup dit(\sigma)' title='dit(\pi \lor \sigma)=dit(\pi) \cup dit(\sigma)' class='latex' />. That defines the same partition as the set-of-blocks definition. One might similarly try to define the dit set of the meet <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%5Cland%20%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi \land \sigma' title='\pi \land \sigma' class='latex' /> as the intersection of the respective dit sets. But we previously noted that the union of two closed sets is not necessarily closed and thus the intersection of two open sets (i.e., dit sets) is not necessarily open. Hence we have to apply the interior operator, and this yields the correct definition: <img src='http://s.wordpress.com/latex.php?latex=dit%28%5Cpi%20%5Cland%20%5Csigma%29%20%3D%20int%28dit%28%5Cpi%29%20%5Ccap%20dit%28%5Csigma%29%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='dit(\pi \land \sigma) = int(dit(\pi) \cap dit(\sigma))' title='dit(\pi \land \sigma) = int(dit(\pi) \cap dit(\sigma))' class='latex' />.</p>
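<p>The join and meet formulas for dit sets can be tried out on a four-element set. This is a hedged sketch, not code from the paper: the helper names <code>dit</code>, <code>join</code> and <code>interior</code> are hypothetical, and the interior is computed by merging elements linked by pairs in the complement, which yields the closure of the complement.</p>

```python
from itertools import product


def dit(partition, U):
    """Ordered pairs of elements of U lying in distinct blocks."""
    block_of = {u: i for i, B in enumerate(partition) for u in B}
    return {(u, v) for u, v in product(U, repeat=2) if block_of[u] != block_of[v]}


def join(pi, sigma):
    """Set-of-blocks join: the non-empty intersections of blocks."""
    return [B.intersection(C) for B in pi for C in sigma if B.intersection(C)]


def interior(pairs, U):
    """Largest dit set contained in pairs."""
    blocks = [{u} for u in sorted(U)]
    # Merge blocks along every pair in the complement; the resulting blocks
    # are the classes of the closure of the complement.
    for u, v in set(product(U, repeat=2)) - set(pairs):
        bu = next(B for B in blocks if u in B)
        bv = next(B for B in blocks if v in B)
        if bu is not bv:
            blocks.remove(bv)
            bu.update(bv)
    return dit(blocks, U)


U = {1, 2, 3, 4}
pi = [{1, 2}, {3, 4}]
sigma = [{1}, {2, 3}, {4}]

join_ok = dit(join(pi, sigma), U) == dit(pi, U).union(dit(sigma, U))
overlap = dit(pi, U).intersection(dit(sigma, U))
meet_dits = interior(overlap, U)
print(join_ok, len(overlap), meet_dits)  # True 6 set()
```

<p>In this example the intersection of the two dit sets has six pairs but is not open. Its interior is empty, so the meet of the two partitions is the blob.</p>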
<p>The third method of defining the &#8220;logical&#8221; partition operations is the graph-theoretic method which uses the ordinary truth tables from Boolean logic. Consider the complete simple (at most one arc or link between nodes and no loops at nodes) undirected graph on the node set U. Given a partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /> on U, we label the arc connecting u and u&#8217; with <img src='http://s.wordpress.com/latex.php?latex=T%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T\pi' title='T\pi' class='latex' /> if u and u&#8217; are in distinct blocks of <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' />, and we label the arc <img src='http://s.wordpress.com/latex.php?latex=F%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='F\pi' title='F\pi' class='latex' /> if u and u&#8217; are in the same block of <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' />. Given another partition <img src='http://s.wordpress.com/latex.php?latex=%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma' title='\sigma' class='latex' />, we can similarly label all the arcs with <img src='http://s.wordpress.com/latex.php?latex=T%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T\sigma' title='T\sigma' class='latex' /> or <img src='http://s.wordpress.com/latex.php?latex=F%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='F\sigma' title='F\sigma' class='latex' /> so that each arc has two &#8220;signed&#8221; labels.</p>
<p>Then consider the Boolean truth table for whatever logical operation one wants to define such as the meet.</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr style="text-align: center;">
<td width="106" valign="top"><img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /></td>
<td width="122" valign="top"><img src='http://s.wordpress.com/latex.php?latex=%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma' title='\sigma' class='latex' /></td>
<td width="125" valign="top"><img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%5Cland%20%20%20%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi \land   \sigma' title='\pi \land   \sigma' class='latex' /></td>
</tr>
<tr style="text-align: center;">
<td width="106" valign="top">T</td>
<td width="122" valign="top">T</td>
<td width="125" valign="top">T</td>
</tr>
<tr style="text-align: center;">
<td width="106" valign="top">T</td>
<td width="122" valign="top">F</td>
<td width="125" valign="top">F</td>
</tr>
<tr style="text-align: center;">
<td width="106" valign="top">F</td>
<td width="122" valign="top">T</td>
<td width="125" valign="top">F</td>
</tr>
<tr style="text-align: center;">
<td width="106" valign="top">F</td>
<td width="122" valign="top">F</td>
<td width="125" valign="top">F</td>
</tr>
</tbody>
</table>
<p>The truth table can then be summarized by specifying that the Boolean conditions for <img src='http://s.wordpress.com/latex.php?latex=T%28%5Cpi%20%5Cland%20%5Csigma%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T(\pi \land \sigma)' title='T(\pi \land \sigma)' class='latex' /> are &#8220;<img src='http://s.wordpress.com/latex.php?latex=T%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T\pi' title='T\pi' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=T%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T\sigma' title='T\sigma' class='latex' />&#8221; while the Boolean conditions for <img src='http://s.wordpress.com/latex.php?latex=F%28%5Cpi%20%5Cland%20%5Csigma%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='F(\pi \land \sigma)' title='F(\pi \land \sigma)' class='latex' /> are &#8220;<img src='http://s.wordpress.com/latex.php?latex=F%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='F\pi' title='F\pi' class='latex' /> or <img src='http://s.wordpress.com/latex.php?latex=F%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='F\sigma' title='F\sigma' class='latex' />.&#8221; Returning now to the complete graph, we see that each arc has either the true conditions <img src='http://s.wordpress.com/latex.php?latex=T%28%5Cpi%20%5Cland%20%5Csigma%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T(\pi \land \sigma)' title='T(\pi \land \sigma)' class='latex' /> or the false conditions <img src='http://s.wordpress.com/latex.php?latex=F%28%5Cpi%20%5Cland%20%5Csigma%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='F(\pi \land \sigma)' title='F(\pi \land \sigma)' class='latex' /> holding on it. 
Then we throw away all the arcs where the true conditions <img src='http://s.wordpress.com/latex.php?latex=T%28%5Cpi%20%5Cland%20%5Csigma%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T(\pi \land \sigma)' title='T(\pi \land \sigma)' class='latex' /> hold, and we retain all the other arcs which are the ones where the false conditions <img src='http://s.wordpress.com/latex.php?latex=F%28%5Cpi%20%5Cland%20%5Csigma%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='F(\pi \land \sigma)' title='F(\pi \land \sigma)' class='latex' /> hold.</p>
<p>Given any simple undirected graph on the node set U, two nodes are <em>connected</em> if there is a finite chain of links connecting the two nodes. Then connectedness immediately determines a partition on the node set where the blocks are the sets of nodes which are connected to one another. To define the partition <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%5Cland%20%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi \land \sigma' title='\pi \land \sigma' class='latex' />, simply use the graph constructed above where the links are the ones where the false conditions <img src='http://s.wordpress.com/latex.php?latex=F%28%5Cpi%20%5Cland%20%5Csigma%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='F(\pi \land \sigma)' title='F(\pi \land \sigma)' class='latex' /> hold. The partition determined on U by the connected components is the partition meet <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%5Cland%20%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi \land \sigma' title='\pi \land \sigma' class='latex' />. It is the same partition previously defined by the closure-space method.</p>
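<p>The graph-theoretic recipe for the meet can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the helper names <code>same_block</code>, <code>components</code> and <code>graph_meet</code> are hypothetical, and the connected components are found by naive block merging rather than by any particular graph library.</p>

```python
from itertools import combinations


def same_block(partition, u, v):
    """True when u and v lie in the same block (the arc is labelled F)."""
    return any(u in B and v in B for B in partition)


def components(nodes, links):
    """Connected components of an undirected graph given by its links."""
    blocks = [{u} for u in sorted(nodes)]
    for u, v in links:
        bu = next(B for B in blocks if u in B)
        bv = next(B for B in blocks if v in B)
        if bu is not bv:
            blocks.remove(bv)
            bu.update(bv)
    return blocks


def graph_meet(pi, sigma, U):
    # Keep exactly the arcs where the false conditions hold: F pi or F sigma.
    kept = [(u, v) for u, v in combinations(sorted(U), 2)
            if same_block(pi, u, v) or same_block(sigma, u, v)]
    return components(U, kept)


U = {1, 2, 3, 4, 5}
pi = [{1, 2}, {3, 4}, {5}]
sigma = [{1}, {2, 3}, {4}, {5}]
meet = graph_meet(pi, sigma, U)
print(meet)  # [{1, 2, 3, 4}, {5}]
```

<p>Here the retained arcs 1-2, 2-3 and 3-4 chain the first four elements into one block, while 5 stays alone.</p>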
<h3>The importance of the implication operation</h3>
<p>Logic studies valid formulas or tautologies which can be defined as formulas that will always evaluate to the top 1 regardless of what subsets or partitions are substituted for the atomic variables. But if the only operations are the lattice operations of join and meet, then the only tautologies in either the lattice of subsets <img src='http://s.wordpress.com/latex.php?latex=%5Cmathcal%7BP%7D%28U%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\mathcal{P}(U)' title='\mathcal{P}(U)' class='latex' /> of  U or the lattice of partitions <img src='http://s.wordpress.com/latex.php?latex=%5CPi%28U%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\Pi(U)' title='\Pi(U)' class='latex' /> on U are trivialities such as <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%5Clor%201&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi \lor 1' title='\pi \lor 1' class='latex' />. And as three Rota associates noted in a 2001 commemorative volume: &#8220;the only operations on the family of equivalence relations fully studied, understood and deployed are the binary join <img src='http://s.wordpress.com/latex.php?latex=%5Clor&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\lor' title='\lor' class='latex' /> and meet <img src='http://s.wordpress.com/latex.php?latex=%5Cland&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\land' title='\land' class='latex' /> operations.&#8221;</p>
<p>Instead of studying tautologies, one might try to study identities, e.g., <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%5Cland%20%5Csigma%20%5Cpreceq%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi \land \sigma \preceq \pi' title='\pi \land \sigma \preceq \pi' class='latex' />, which are partial order statements between two formulas that are true for any subsets or partitions substituted for the atomic variables in the formulas. But Philip Whitman showed in 1946 that lattices of partitions are so versatile that any lattice can be embedded in a lattice of partitions. Hence any identity true in all partition lattices would be true in all lattices. It would be a general lattice-theoretic identity saying nothing specific about partitions. In order to get some specific identities involving lattice-theoretic formulas (i.e., only using join and meet), one would have to restrict the class of partition lattices. Indeed, in the closest previous work, Rota and colleagues developed a &#8220;logic of commuting equivalence relations&#8221; using only the lattice operations but obtaining non-trivial identities by restricting to lattices of commuting equivalence relations [Finberg, David, Matteo Mainetti and Gian-Carlo Rota 1996. The Logic of Commuting Equivalence Relations. In <em>Logic and Algebra</em>. A. Ursini and P. Agliano eds., New York: Marcel Dekker<strong>: </strong>69-96].</p>
<p>But the study of identities is quite restrictive in comparison with the study of tautologies. When one has the implication operation, then corresponding to every identity, there is a tautology obtained by replacing the partial ordering relation by the implication. For instance the identity <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%5Cpreceq%20%5Cpi%20%5Clor%20%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi \preceq \pi \lor \sigma' title='\pi \preceq \pi \lor \sigma' class='latex' /> transforms into the tautology <img src='http://s.wordpress.com/latex.php?latex=%5Cpi%20%5CRightarrow%20%28%5Cpi%20%5Clor%20%5Csigma%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi \Rightarrow (\pi \lor \sigma)' title='\pi \Rightarrow (\pi \lor \sigma)' class='latex' />. But even rather simple tautologies like <em>modus ponens</em>, <img src='http://s.wordpress.com/latex.php?latex=%28%5Csigma%20%5Cland%20%28%5Csigma%20%5CRightarrow%20%5Cpi%29%29%20%5CRightarrow%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='(\sigma \land (\sigma \Rightarrow \pi)) \Rightarrow \pi' title='(\sigma \land (\sigma \Rightarrow \pi)) \Rightarrow \pi' class='latex' />, do not express identities between lattice formulas (due to the appearance of the other implication sign in the premise).</p>
<h3>Defining the partition implication operation</h3>
<p>Implication is the quintessentially logical operation, and the definition of this operation on partitions was a key step in the development of partition logic. We have three methods to define partition operations, and each one suggests an appropriate definition of implication.</p>
<p>As noted in the remark above, one desideratum in the implication is: <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%5Cpreceq%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma \preceq \pi' title='\sigma \preceq \pi' class='latex' /> if and only if <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%5CRightarrow%20%5Cpi%20%3D%201&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma \Rightarrow \pi = 1' title='\sigma \Rightarrow \pi = 1' class='latex' />. In the lattice of partitions <img src='http://s.wordpress.com/latex.php?latex=%5CPi%28U%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\Pi(U)' title='\Pi(U)' class='latex' /> the top 1 is the discrete partition of all singletons. Now <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%5Cpreceq%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma \preceq \pi' title='\sigma \preceq \pi' class='latex' /> means that for any <img src='http://s.wordpress.com/latex.php?latex=B%5Cin%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B\in \pi' title='B\in \pi' class='latex' /> there is a <img src='http://s.wordpress.com/latex.php?latex=C%5Cin%20%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='C\in \sigma' title='C\in \sigma' class='latex' /> such that <img src='http://s.wordpress.com/latex.php?latex=B%5Csubseteq%20C&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B\subseteq C' title='B\subseteq C' class='latex' />. 
Hence a suitable candidate definition would be to say that for any <img src='http://s.wordpress.com/latex.php?latex=B%5Cin%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B\in \pi' title='B\in \pi' class='latex' /> such that there is a <img src='http://s.wordpress.com/latex.php?latex=C%5Cin%20%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='C\in \sigma' title='C\in \sigma' class='latex' /> with <img src='http://s.wordpress.com/latex.php?latex=B%20%5Csubseteq%20C&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='B \subseteq C' title='B \subseteq C' class='latex' />, then B is discretized (i.e., replaced by singletons) in <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%5CRightarrow%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma \Rightarrow \pi' title='\sigma \Rightarrow \pi' class='latex' />, and otherwise B remains whole in <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%5CRightarrow%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma \Rightarrow \pi' title='\sigma \Rightarrow \pi' class='latex' />. Thus the candidate definition of <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%5CRightarrow%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma \Rightarrow \pi' title='\sigma \Rightarrow \pi' class='latex' /> is like <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /> except that whenever a block of <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /> is contained in a block of <img src='http://s.wordpress.com/latex.php?latex=%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma' title='\sigma' class='latex' />, then that block of <img src='http://s.wordpress.com/latex.php?latex=%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\pi' title='\pi' class='latex' /> is discretized.</p>
<p>Using the closure space method, the analogy with the topological interpretation of intuitionistic propositional logic suggests itself. In the usual subset interpretation, the implication <img src='http://s.wordpress.com/latex.php?latex=S%5CRightarrow%20T&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='S\Rightarrow T' title='S\Rightarrow T' class='latex' /> is defined as <img src='http://s.wordpress.com/latex.php?latex=S%5E%7Bc%7D%5Ccup%20T&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='S^{c}\cup T' title='S^{c}\cup T' class='latex' /> for subsets <img src='http://s.wordpress.com/latex.php?latex=S%2CT%20%5Csubseteq%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='S,T \subseteq U' title='S,T \subseteq U' class='latex' />. In the intuitionistic topological interpretation, S and T would be open subsets of a topological space U, and the interior operator would have to be added to ensure that the result was open: <img src='http://s.wordpress.com/latex.php?latex=S%5CRightarrow%20T%20%3D%20int%28S%5E%7Bc%7D%5Ccup%20T%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='S\Rightarrow T = int(S^{c}\cup T)' title='S\Rightarrow T = int(S^{c}\cup T)' class='latex' />. Since we have &#8220;open subsets&#8221; and an &#8220;interior&#8221; operator in the closure space treatment of partitions, the obvious suggested definition of the implication would be: <img src='http://s.wordpress.com/latex.php?latex=dit%28%5Csigma%20%5CRightarrow%20%5Cpi%29%3Dint%28dit%28%5Csigma%29%5E%7Bc%7D%5Ccup%20dit%28%5Cpi%29%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='dit(\sigma \Rightarrow \pi)=int(dit(\sigma)^{c}\cup dit(\pi))' title='dit(\sigma \Rightarrow \pi)=int(dit(\sigma)^{c}\cup dit(\pi))' class='latex' />.</p>
<p>And finally there is the graphical method that allows us to generate a partition operation from a Boolean truth table. Hence we take the Boolean truth table for the implication <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%5CRightarrow%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma \Rightarrow \pi' title='\sigma \Rightarrow \pi' class='latex' />, and we note that the Boolean conditions for <img src='http://s.wordpress.com/latex.php?latex=F%28%5Csigma%20%5CRightarrow%20%5Cpi%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='F(\sigma \Rightarrow \pi)' title='F(\sigma \Rightarrow \pi)' class='latex' /> are &#8220;<img src='http://s.wordpress.com/latex.php?latex=T%5Csigma&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T\sigma' title='T\sigma' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=F%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='F\pi' title='F\pi' class='latex' />.&#8221; Then we retain the arcs of the complete graph labelled <img src='http://s.wordpress.com/latex.php?latex=T%5Csigma%2CF%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T\sigma,F\pi' title='T\sigma,F\pi' class='latex' /> and throw out the other arcs. The connected components in the resulting graph are the blocks in this candidate definition of <img src='http://s.wordpress.com/latex.php?latex=%5Csigma%20%5CRightarrow%20%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\sigma \Rightarrow \pi' title='\sigma \Rightarrow \pi' class='latex' />. In the following example of a graph, the arcs where the false conditions <img src='http://s.wordpress.com/latex.php?latex=T%5Csigma%2CF%5Cpi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='T\sigma,F\pi' title='T\sigma,F\pi' class='latex' /> hold are thickened.</p>
<div id="attachment_13" class="wp-caption alignleft" style="width: 320px"><a href="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Fig2-graph-for-implication2.jpg"><img class="size-full wp-image-13 " title="Fig2-graph-for-implication" src="http://www.mathblog.ellerman.org/wp-content/uploads/2010/02/Fig2-graph-for-implication2.jpg" alt="" width="310" height="182" /></a><p class="wp-caption-text">Graph for partition implication</p></div>
<p>All three of the definitions are equivalent, and that is the definition used in the development of partition logic. My preprint of the introductory paper on partition logic forthcoming in <em>The Review of Symbolic Logic</em> can be downloaded <a href="http://www.ellerman.org/Davids-Stuff/Maths/Logic-of-Partitions-Reprint.pdf">here</a>.</p>
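<p>The agreement of the three definitions can be spot-checked on a small example. The following is only a sketch with hypothetical function names (<code>implies_blocks</code>, <code>implies_closure</code>, <code>implies_graph</code>); it checks one pair of partitions and is not a proof of equivalence. Results are returned as sets of frozensets so they can be compared directly.</p>

```python
from itertools import combinations


def same_block(partition, u, v):
    return any(u in B and v in B for B in partition)


def components(nodes, links):
    """Connected components, returned as a set of frozensets."""
    blocks = [{u} for u in sorted(nodes)]
    for u, v in links:
        bu = next(B for B in blocks if u in B)
        bv = next(B for B in blocks if v in B)
        if bu is not bv:
            blocks.remove(bv)
            bu.update(bv)
    return {frozenset(B) for B in blocks}


def implies_blocks(sigma, pi):
    """Set-of-blocks: discretize each block of pi contained in a block of sigma."""
    result = set()
    for B in pi:
        if any(B.issubset(C) for C in sigma):
            result.update(frozenset({u}) for u in B)
        else:
            result.add(frozenset(B))
    return result


def implies_closure(sigma, pi, U):
    """Closure space: interior of (complement of dit(sigma)) union dit(pi)."""
    pairs = {(u, v) for u in U for v in U}
    dit_sigma = {p for p in pairs if not same_block(sigma, *p)}
    dit_pi = {p for p in pairs if not same_block(pi, *p)}
    target = (pairs - dit_sigma).union(dit_pi)
    # The closure of the complement is an equivalence relation; its classes
    # are the blocks of the partition whose dit set is the interior of target.
    return components(U, pairs - target)


def implies_graph(sigma, pi, U):
    """Graph method: keep the arcs labelled T sigma, F pi."""
    kept = [(u, v) for u, v in combinations(sorted(U), 2)
            if not same_block(sigma, u, v) and same_block(pi, u, v)]
    return components(U, kept)


U = {1, 2, 3, 4, 5}
sigma = [{1, 2}, {3, 4, 5}]
pi = [{1, 2, 3}, {4, 5}]
by_blocks = implies_blocks(sigma, pi)
by_closure = implies_closure(sigma, pi, U)
by_graph = implies_graph(sigma, pi, U)
print(by_blocks == by_closure == by_graph)  # True
```

<p>In this example the block {4,5} of π lies inside the block {3,4,5} of σ and is discretized, while {1,2,3} stays whole, so all three methods give the blocks {1,2,3}, {4} and {5}.</p>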
]]></content:encoded>
			<wfw:commentRss>http://www.mathblog.ellerman.org/2010/02/the-implication-operation-on-partitions-2/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>From propositional logic to subset logic to partition logic</title>
		<link>http://www.mathblog.ellerman.org/2010/01/from-propositional-logic-to-subset-logic-to-partition-logic/</link>
		<comments>http://www.mathblog.ellerman.org/2010/01/from-propositional-logic-to-subset-logic-to-partition-logic/#comments</comments>
		<pubDate>Fri, 29 Jan 2010 05:30:06 +0000</pubDate>
		<dc:creator>David Ellerman</dc:creator>
				<category><![CDATA[Category theory]]></category>
		<category><![CDATA[Math logic]]></category>
		<category><![CDATA[Partition logic]]></category>
		<category><![CDATA[Philosophy]]></category>
		<category><![CDATA[Boole]]></category>
		<category><![CDATA[Lawvere]]></category>
		<category><![CDATA[propositional logic]]></category>
		<category><![CDATA[subset logic]]></category>

		<guid isPermaLink="false">http://www.mathblog.ellerman.org/?p=4</guid>
		<description><![CDATA[In the modern category-theoretic treatment of logic, the variables in "propositional" logic should be interpreted as subsets of some given nonempty universe set U, i.e., propositional logic is subset logic. Since partitions on a set are dual to subsets of the set, the idea arises of a dual logic of partitions.]]></description>
			<content:encoded><![CDATA[<h3>From propositional logic to subset logic</h3>
<p>This note outlines the following sequence of ideas. First, ordinary propositional logic is reinterpreted as the logic of subsets of a universe set U, with the propositional case being isomorphic to the special case of U = 1. Then the category-theoretic duality between subsets of a set and partitions on a set is used to broach the idea of a dual logic of partitions. At the end of the note, a link is given to the final draft of my forthcoming paper in the <em>Review of Symbolic Logic</em> which develops the logic of partitions from the basic ideas up through the correctness and completeness theorems for a tableau system of (zeroth order) partition logic.</p>
<p><span id="more-4"></span></p>
<p>Largely due to the efforts of F. William Lawvere in category theory, there is a &#8220;rethinking&#8221; of logic taking place. Lawvere&#8217;s most accessible restatement of logic is <em>Appendix A: Logic as the Algebra of Parts</em> in: Lawvere, F. William and Robert Rosebrugh 2003, <em>Sets for Mathematics</em>. Cambridge: Cambridge University Press.</p>
<p>For instance, what is the subject matter of the &#8220;propositional&#8221; part of mathematical logic since the notion of a &#8220;proposition&#8221; is far from being a precise mathematical concept? In Lawvere&#8217;s vision of &#8220;Logic as the Algebra of Parts,&#8221; a &#8220;part&#8221; is just Lawvere-speak for a &#8220;subobject.&#8221; For present purposes, we may restrict attention to the category of sets where &#8220;subobject&#8221; just means &#8220;subset.&#8221; Hence under this interpretation, the atomic variables in the formulas of &#8220;propositional&#8221; or zeroth-order logic would be interpreted not as propositions but as subsets of some fixed (non-empty) universe set U. The &#8220;connectives&#8221; of zeroth-order logic would be interpreted not as operations on propositions (such as &#8220;and&#8221; or &#8220;or&#8221;) but as operations on subsets (such as intersection or union).</p>
<blockquote>
<p style="text-align: justify;"><span style="color: #000000;">The propositional calculus considers &#8220;Propositions&#8221; p, q, r,&#8230; combined under the operations &#8220;and&#8221;, &#8220;or&#8221;, &#8220;implies&#8221;, and &#8220;not&#8221;, often written as <img src='http://s.wordpress.com/latex.php?latex=p%5Cland%20q&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p\land q' title='p\land q' class='latex' />, <img src='http://s.wordpress.com/latex.php?latex=p%5Clor%20q&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p\lor q' title='p\lor q' class='latex' />, <img src='http://s.wordpress.com/latex.php?latex=p%5CRightarrow%20q&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='p\Rightarrow q' title='p\Rightarrow q' class='latex' />, and <img src='http://s.wordpress.com/latex.php?latex=%5Clnot%20p&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\lnot p' title='\lnot p' class='latex' />. Alternatively, if P, Q, R,&#8230; are subsets of some fixed set U with elements u, each proposition p may be replaced by the proposition <img src='http://s.wordpress.com/latex.php?latex=u%5Cin%20P&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='u\in P' title='u\in P' class='latex' /> for some subset <img src='http://s.wordpress.com/latex.php?latex=P%5Csubset%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P\subset U' title='P\subset U' class='latex' />; the propositional connectives above then become operations on subsets; intersection <img src='http://s.wordpress.com/latex.php?latex=%5Cland&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\land' title='\land' class='latex' />, union <img src='http://s.wordpress.com/latex.php?latex=%5Clor&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\lor' title='\lor' class='latex' />, implication (<img src='http://s.wordpress.com/latex.php?latex=P%5CRightarrow%20Q&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P\Rightarrow Q' title='P\Rightarrow Q' class='latex' /> is <img 
src='http://s.wordpress.com/latex.php?latex=%5Clnot%20P%5Clor%20Q&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\lnot P\lor Q' title='\lnot P\lor Q' class='latex' />), and complement of subsets. [Mac Lane, Saunders and Ieke Moerdijk 1992. </span><em>Sheaves in Geometry and Logic: A First Introduction to Topos Theory.</em><span style="color: #000000;"> New York: Springer, p. 48]</span></p>
</blockquote>
<p>A <em>tautology</em> of zeroth-order logic would then be a formula that, regardless of which subsets of U are assigned to the atomic variables, always evaluates to the universe set U. We might call this a <em>subset tautology</em> of zeroth-order logic.</p>
<p>One special case of the universe set U is 1, &#8220;the&#8221; singleton set with only two subsets, itself and the null set, which we could denote as 1 and 0. The subset operations on these two subsets could be defined by the usual &#8220;truth tables&#8221; for the &#8220;propositional connectives&#8221; where a &#8220;proposition&#8221; is mathematically modeled as a variable with the possible values 0 and 1 (for F and T). Then one arrives at the notion of a <em>truth-table tautology</em>, which is a zeroth-order formula that, regardless of the assignments of 0 or 1 to the atomic variables, always evaluates to the value 1. Now, clearly any subset tautology is also a truth-table tautology since U = 1 is only a special case of the universe set U. It is an interesting and philosophically &#8220;fateful&#8221; fact that the converse is also true. A zeroth-order formula that always evaluates to the universe set U = 1 in that very special case will evaluate to the universe set U for arbitrary non-empty U when arbitrary subsets of U are substituted for the atomic variables. The reason is that the subset operations are computed element by element: whether a given element u of U belongs to the subset denoted by a formula is determined by applying the truth tables to the 0-1 values recording whether u belongs to each of the subsets assigned to the atomic variables. Hence the distinct concepts of a subset tautology and a truth-table tautology turn out to be the same in the context of subsets of a universe set U.</p>
<p>The fact that &#8220;subset tautology = truth-table tautology&#8221; is philosophically fateful since it has allowed the very special case of U = 1, i.e., the propositional interpretation, to completely &#8220;take over&#8221; the interpretation of zeroth-order logic and hence the very name of &#8220;propositional logic.&#8221; The 0-1 propositional interpretation is, to be sure, an extremely important special case for U, and a special case that suffices to delimit the zeroth-order valid formulas, but the broader concept of zeroth-order logic is <em>subset logic</em>, not just propositional logic.</p>
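<p>The coincidence of the two notions of tautology can be checked by brute force on small universe sets. The following Python sketch is only an illustration and not part of any cited source: the helper names (powerset, implies, peirce, is_subset_tautology) and the choice of Peirce's law as the test formula are assumptions made here. It interprets implication as the complement of P united with Q, as in the Mac Lane and Moerdijk quotation, and then tests a formula both on a singleton universe (the truth-table case) and on a three-element universe.</p>

```python
from itertools import combinations, product

def powerset(universe):
    """Return all subsets of `universe` as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def implies(U, P, Q):
    """Subset reading of P => Q: (complement of P in U) union Q."""
    return (U - P) | Q

def peirce(U, P, Q):
    """Peirce's law ((P => Q) => P) => P, evaluated on subsets of U."""
    return implies(U, implies(U, implies(U, P, Q), P), P)

def is_subset_tautology(formula, universe, arity):
    """True if `formula` evaluates to U for every assignment of subsets of U."""
    U = frozenset(universe)
    return all(formula(U, *subsets) == U
               for subsets in product(powerset(U), repeat=arity))

ONE = {"*"}            # singleton universe: the truth-table case
BIG = {"a", "b", "c"}  # a larger universe with 8 subsets

for name, formula in [("Peirce's law", peirce), ("P implies Q", implies)]:
    print(name,
          is_subset_tautology(formula, ONE, 2),
          is_subset_tautology(formula, BIG, 2))
```

<p>For both formulas the verdict on the singleton universe agrees with the verdict on the larger universe, as the equivalence predicts. A finite check of this kind illustrates the fact but is of course no proof of it.</p>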
<h3>The historical background</h3>
<p>In Alonzo Church&#8217;s &#8220;Historical notes&#8221; on the &#8220;propositional calculus,&#8221; he noted that Boole actually developed the algebra of subsets.</p>
<blockquote>
<p style="text-align: justify;"><span style="color: #000000;">The algebra of logic has its beginning in 1847, in the publications of Boole and De Morgan. This concerned itself at first with an algebra or calculus of classes, to which a similar algebra of relations was later added. Though it was foreshadowed in Boole&#8217;s treatment of &#8220;Secondary Propositions,&#8221; a true propositional calculus perhaps first appeared from this point of view in the work of Hugh MacColl, beginning in 1877. [Church, Alonzo 1956. </span><em>Introduction to Mathematical Logic.</em><span style="color: #000000;"> Princeton: Princeton University Press, pp. 155-6]</span></p>
</blockquote>
<p>In contrast, Frege used the propositional interpretation from the outset in his <em>Begriffsschrift</em> of 1879. Boole, on the other hand, noted that the propositional interpretation was essentially a special case.</p>
<blockquote>
<p style="text-align: justify;"><span style="color: #000000;">But while the laws and processes of the method remain unchanged, the rule of interpretation must be adapted to new conditions. Instead of classes of things, we shall have to substitute propositions, and for the relations of classes and individuals, we shall have to consider the connexions of propositions or of events. [Boole, George 1854. </span><em>An Investigation of the Laws of Thought on which are founded the Mathematical Theories of Logic and Probabilities.</em><span style="color: #000000;"> Cambridge: Macmillan and Co., p. 162]</span></p>
</blockquote>
<p>Moreover Boole emphasized the fateful fact that for the purposes of valid calculations with what we now call &#8220;Boolean operations,&#8221; it suffices to take the values to be just 0 or 1.</p>
<blockquote>
<p style="text-align: justify;"><span style="color: #000000;"><em>We may in fact lay aside the logical interpretation of the symbols in the given equation; convert them into quantitative symbols, susceptible only of the values 0 and 1; perform upon them as such all the requisite processes of solution; and finally restore them to their logical interpretation.</em> [Boole, Ibid., p. 70 (his italics)]</span></p>
</blockquote>
<p>From the viewpoint of Lawvere&#8217;s &#8220;Logic as the Algebra of Parts,&#8221; Boole&#8217;s treatment of logic as the algebra of subsets (today called &#8220;Boolean algebra&#8221;), with the propositional interpretation being an important special case, is surprisingly modern.</p>
<h3>Subset logic with general quantifiers</h3>
<p>But what about quantification theory or first-order logic where the quantifiers date back to Frege, not to Boole? In Lawvere&#8217;s treatment of logic in the context of sets, the &#8220;quantifiers&#8221; arise naturally out of set functions between universe sets. Given a function <img src='http://s.wordpress.com/latex.php?latex=f%3A%20U%20%5Crightarrow%20V&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f: U \rightarrow V' title='f: U \rightarrow V' class='latex' /> from one universe set to another, and given a subset <img src='http://s.wordpress.com/latex.php?latex=P%20%5Csubseteq%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P \subseteq U' title='P \subseteq U' class='latex' />, the &#8220;existential quantifier&#8221; <img src='http://s.wordpress.com/latex.php?latex=%5Cexists%20_%7Bf%7DP&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\exists _{f}P' title='\exists _{f}P' class='latex' /> is the direct image subset <img src='http://s.wordpress.com/latex.php?latex=f%28P%29%20%5Csubseteq%20V&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f(P) \subseteq V' title='f(P) \subseteq V' class='latex' />, i.e., the subset of all elements v of V such that there exists an element u of P with f(u) = v. And the &#8220;universal quantifier&#8221; <img src='http://s.wordpress.com/latex.php?latex=%5Cforall%20_%7Bf%7DP&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\forall _{f}P' title='\forall _{f}P' class='latex' /> is the subset of all elements v in V such that <img src='http://s.wordpress.com/latex.php?latex=f%5E%7B-1%7D%28v%29%20%5Csubseteq%20P&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f^{-1}(v) \subseteq P' title='f^{-1}(v) \subseteq P' class='latex' />.</p>
<p>Thus zeroth-order logic is concerned with the logical operations on the subsets of a given universe set U, and first-order logic adds two canonical ways that subsets of one universe map to subsets of another universe given a set function <img src='http://s.wordpress.com/latex.php?latex=f%3A%20U%20%5Crightarrow%20V&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f: U \rightarrow V' title='f: U \rightarrow V' class='latex' /> from the one universe to the other.</p>
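<p>The two quantifiers along a set function can be computed directly for finite sets. The sketch below is an illustration under assumptions made here: the function names (fiber, exists_f, forall_f) and the parity example are hypothetical and chosen only for this purpose. The existential quantifier collects the elements of V whose inverse image meets P, and the universal quantifier collects the elements of V whose inverse image lies inside P.</p>

```python
def fiber(f, U, v):
    """Inverse image of the single element v under f: U -> V."""
    return {u for u in U if f(u) == v}

def exists_f(f, U, V, P):
    """Direct image of P: all v in V with some u in P such that f(u) == v."""
    return {v for v in V if fiber(f, U, v).intersection(P)}

def forall_f(f, U, V, P):
    """All v in V whose whole inverse image is contained in P."""
    return {v for v in V if fiber(f, U, v).issubset(P)}

# Hypothetical example: classify numbers by parity; "none" is hit by no element.
U = {1, 2, 3, 4}
V = {"odd", "even", "none"}

def parity(u):
    return "odd" if u % 2 == 1 else "even"

P = {1, 3, 4}
print(sorted(exists_f(parity, U, V, P)))  # "odd" and "even" both meet P
print(sorted(forall_f(parity, U, V, P)))  # "odd" and, vacuously, "none"
```

<p>Note that an element of V with an empty inverse image, such as "none" here, always belongs to the universal quantifier and never to the existential one.</p>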
<p>As Church noted, the algebra of relations was seen historically as a later extension of Boole&#8217;s algebra of subsets, but from the mathematical viewpoint, a relation is a subset of some product set <img src='http://s.wordpress.com/latex.php?latex=U%5Ctimes%5Cldots%5Ctimes%20W&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U\times\ldots\times W' title='U\times\ldots\times W' class='latex' /> and that product set can be taken as the universe set in the powerset Boolean algebra. In the special case where the universe set is a product set, say the 3rd power <img src='http://s.wordpress.com/latex.php?latex=U%5E%7B3%7D%20%3D%20U%5Ctimes%20U%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U^{3} = U\times U\times U' title='U^{3} = U\times U\times U' class='latex' />, there are certain canonical mappings to a lower power <img src='http://s.wordpress.com/latex.php?latex=U%5E%7B2%7D%20%3D%20U%5Ctimes%20U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U^{2} = U\times U' title='U^{2} = U\times U' class='latex' />, namely the 3 projection maps that leave out the first, second, or third &#8220;coordinate.&#8221; In this very special case of the projection maps between product sets, the two Lawvere-quantifiers are the two usual quantifiers of &#8220;quantification theory.&#8221; The last projection map, from <img src='http://s.wordpress.com/latex.php?latex=U%20%3D%20U%5E%7B1%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U = U^{1}' title='U = U^{1}' class='latex' /> to <img src='http://s.wordpress.com/latex.php?latex=U%5E%7B0%7D%3D%201&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U^{0}= 1' title='U^{0}= 1' class='latex' />, induces quantifiers that take subsets of U to subsets of 1, where the propositional interpretation holds sway for the subsets of 1.</p>
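<p>The projection case can be made concrete in the same way. In the following sketch, the divisibility relation, the helper names, and the use of the empty tuple as the single element of 1 are all assumptions made for illustration. Quantifying a binary relation R on U along the projection that leaves out the second coordinate yields the familiar "there exists y" and "for all y" subsets of U, and the projection from U to 1 yields a subset of 1, i.e., a truth value.</p>

```python
from itertools import product

def fiber(f, domain, v):
    """Inverse image of the single element v under f."""
    return {x for x in domain if f(x) == v}

def exists_f(f, domain, codomain, P):
    """All v in the codomain whose inverse image meets P."""
    return {v for v in codomain if fiber(f, domain, v).intersection(P)}

def forall_f(f, domain, codomain, P):
    """All v in the codomain whose inverse image is contained in P."""
    return {v for v in codomain if fiber(f, domain, v).issubset(P)}

U = {1, 2, 3, 4}
U2 = set(product(U, repeat=2))

# Hypothetical relation R(x, y): "x divides y", as a subset of U x U.
R = {(x, y) for (x, y) in U2 if y % x == 0}

def first(pair):
    """Projection U x U -> U that leaves out the second coordinate."""
    return pair[0]

print(sorted(exists_f(first, U2, U, R)))  # x such that some y has R(x, y)
print(sorted(forall_f(first, U2, U, R)))  # x such that every y has R(x, y)

# The last projection U -> 1, with 1 modeled as the set holding the empty tuple.
ONE = {()}

def to_point(u):
    return ()

print(exists_f(to_point, U, ONE, {1}) == ONE)  # is the subset {1} non-empty?
print(forall_f(to_point, U, ONE, {1}) == ONE)  # is the subset {1} all of U?
```

<p>Here every x divides itself, so the existential quantifier returns all of U, while only 1 divides every element, so the universal quantifier returns the subset containing just 1.</p>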
<p>In sum, in the modern Lawvere treatment of logic as the algebra of subsets (in the case of the category of sets), zeroth-order logic deals with logical operations on subsets of any non-empty universe set U, and first-order logic deals with the two quantifier morphisms taking subsets of one universe U to subsets of another universe V which are induced by a given set map <img src='http://s.wordpress.com/latex.php?latex=f%3A%20U%20%5Crightarrow%20V&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f: U \rightarrow V' title='f: U \rightarrow V' class='latex' />. The usual &#8220;propositional logic&#8221; is arrived at by taking the special case of U = 1, and the usual &#8220;quantification theory&#8221; is arrived at by taking the underlying sets to be the powers of a set U and the set maps to be the projections from an n<sup>th</sup> power U<sup>n</sup> to the (n-1)<sup>st</sup> power U<sup>n-1</sup>.</p>
<h3>From subset logic to partition logic</h3>
<p>I have a vested interest in the broader interpretation of logic. If logic is restricted to the special case of propositions and quantified formulas, then there is no notion of a &#8220;dual&#8221; of a proposition or quantified formula. But if logic is seen broadly as the algebra of subsets on a universe set (and the morphisms of subsets induced by set functions between universe sets), then there is a natural notion of the dual of a subset or &#8220;part.&#8221; &#8220;The dual notion (obtained by reversing the arrows) of &#8216;part&#8217; is the notion of partition.&#8221; [Lawvere and Rosebrugh, op. cit., p. 85] This has long been seen in the duality between the notions of subobject and quotient object throughout modern algebra (e.g., subgroup versus quotient group). In view of that duality, if logic is the logic of subsets, then the idea arises of a dual logic of partitions. For over a century there has been a rudimentary version of that dual logic, limited to the meet and join operations in the lattice of partitions on a universe set U, studied in combinatorial theory and algebra. But now the implication operation as well as the other logical operations on partitions have been successfully defined so that the zeroth-order logic of partitions can be fully developed. My recent paper in the <em>Review of Symbolic Logic</em>, &#8220;The Logic of Partitions: Introduction to the Dual of the Logic of Subsets,&#8221; goes from basic concepts up through the correctness and completeness theorems for a tableau system of (zeroth-order) partition logic. My reprint of the paper can be downloaded from my website <a href="http://www.ellerman.org/Davids-Stuff/Maths/Logic-of-Partitions-Reprint.pdf">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.mathblog.ellerman.org/2010/01/from-propositional-logic-to-subset-logic-to-partition-logic/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

