Recently I came across a nice read about entropy by Cover and Thomas. I want to summarize what entropy ‘physically’ means from an information coding perspective.
The definition of entropy for a discretely valued random variable $X$ with probability mass function $p(x)$ is:
$$H(X) = -\sum_{x} p(x) \log_2 p(x)$$
Entropy does not depend on the values that $X$ takes, only on their probabilities. It describes distributional properties of $X$. The unit of entropy is the bit: to express a quantity in bits, take its base-2 logarithm.
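As a quick numerical check of the usual definition $H(X) = -\sum_x p(x) \log_2 p(x)$, here is a minimal sketch in Python. The function name `entropy` and the example distributions are my own choices for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution given as a list of probabilities."""
    # Terms with p == 0 are skipped: p * log2(p) tends to 0 as p tends to 0.
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair_coin = entropy([0.5, 0.5])      # two equally likely outcomes
biased_coin = entropy([0.9, 0.1])    # one outcome dominates
uniform_four = entropy([0.25] * 4)   # four equally likely outcomes

print(fair_coin, biased_coin, uniform_four)
```

Note that only the probabilities go in; the actual values of the random variable never appear, which is the point made above.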
What does entropy measure? Suppose person $A$ observes the value of $X$ and needs to communicate this value to person $B$ by sending a binary message. This message can be of any length. Person $A$ wants to choose an encoding scheme that minimizes the expected length of the message. A good strategy is to order the possible values of $X$ by decreasing probability. Let's denote this ordering by $x_1, x_2, x_3, \ldots$. Among many possible representations, something like the following will work:
$x_1 \to 0$, $x_2 \to 10$, $x_3 \to 110$,
and so on. The details of this representation are less relevant. The key point is simply to encode the most probable values of $X$ with shorter strings. One can show that under the optimal representation the expected length $L$ of the string satisfies $H(X) \le L < H(X) + 1$. This is exactly what entropy measures. Notice that if all possible values of $X$ are equally likely, then no ‘smart’ representation can reduce the average length of the string. This is the case of maximal entropy, which corresponds to the uniform distribution.
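To see the bound on the expected length in action, here is a rough sketch that builds a binary Huffman code and compares its expected length with the entropy. Huffman coding is not discussed above; I use it here only because it is a standard way to obtain an optimal prefix code. The function name `huffman_lengths` and the example probabilities are assumptions made for illustration.

```python
import heapq
import math

def huffman_lengths(probs):
    """Return the codeword length that a binary Huffman code assigns to each symbol."""
    if len(probs) == 1:
        return [1]
    # Heap entries: (subtree probability, unique tie-breaker, symbol indices in the subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

probs = [0.5, 0.25, 0.125, 0.125]   # already sorted by decreasing probability
lengths = huffman_lengths(probs)
expected_length = sum(p * l for p, l in zip(probs, lengths))
H = -sum(p * math.log2(p) for p in probs if p > 0)

print(lengths, expected_length, H)
```

For this particular distribution every probability is a power of 1/2, so the expected length matches the entropy exactly. For other distributions the expected length lands somewhere between the entropy and the entropy plus one bit.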
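Kullback–Leibler distance:
$$D(p \| q) = \sum_{x} p(x) \log_2 \frac{p(x)}{q(x)}$$
This quantity measures how far apart two distributions $p$ and $q$ are (it is not symmetric in $p$ and $q$, so it is not a distance in the strict sense). It measures by how much longer, on average, the message becomes if its encoding scheme is optimized for distribution $q$ when the true distribution is $p$.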
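The coding interpretation can be checked numerically. The sketch below assumes idealized code lengths of $\log_2(1/q(x))$ bits for a code designed for $q$ (ignoring rounding to whole bits), and compares the resulting cost under the true distribution $p$ with the cost of a code designed for $p$ itself. The names `kl_divergence`, `p` and `q` and the numbers are illustrative choices of mine.

```python
import math

def kl_divergence(p, q):
    """D(p || q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]   # true distribution
q = [0.25, 0.25, 0.5]   # distribution the code was designed for

# Average message length when symbols are drawn from p but coded for q.
cost_with_wrong_code = sum(pi * math.log2(1 / qi) for pi, qi in zip(p, q))
# Average message length when the code is designed for p itself (the entropy of p).
cost_with_right_code = sum(pi * math.log2(1 / pi) for pi in p)
extra_bits = cost_with_wrong_code - cost_with_right_code

print(extra_bits, kl_divergence(p, q))
```

The two printed numbers agree: the penalty for using the wrong code is the Kullback–Leibler quantity.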
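Mutual information:
$$I(X;Y) = \sum_{x,y} p(x,y) \log_2 \frac{p(x,y)}{p(x)\,p(y)}$$
This quantity measures by how much longer, on average, a message encoding the pair $(X, Y)$ becomes if its encoding scheme is optimized under the assumption that $X$ and $Y$ are independent, using only knowledge of their marginals $p(x)$ and $p(y)$.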
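A small sketch of mutual information computed from a joint probability table, under the standard definition $I(X;Y) = \sum_{x,y} p(x,y) \log_2 \frac{p(x,y)}{p(x)p(y)}$. The function name `mutual_information` and the two example tables are hypothetical, chosen to show the two extremes.

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits, from a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal of X (row sums)
    py = [sum(col) for col in zip(*joint)]    # marginal of Y (column sums)
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

independent = [[0.25, 0.25], [0.25, 0.25]]       # X and Y are independent fair bits
perfectly_dependent = [[0.5, 0.0], [0.0, 0.5]]   # Y always equals X

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(perfectly_dependent)

print(mi_independent, mi_dependent)
```

When the variables really are independent, coding with the marginals costs nothing extra. When $Y$ is a copy of $X$, coding them as if they were independent wastes one full bit per pair.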
