Archive for the ‘math’ tag
Math Proofs are Complex
Computer scientists believe that all problems solvable by a computer can be divided into two classes: easy (“Polynomial”) and hard (“Non-Polynomial”). The most important problem in computer science is to prove that these two classes are distinct and there isn’t some trick to make hard problems easy (“P ≠ NP“). Among other applications, if P = NP, then all encryption is broken.
A guy named Vinay Deolalikar has been passing around a 100-page paper that might be a proof that P ≠ NP. The paper passes the Ten Signs a Claimed Mathematical Breakthrough is Wrong checklist (scroll down). But, even moreso than previous famous proofs of Fermat’s Last Theorem and the Poincare Conjecture, Deolalikar’s proof spans so many subjects that there’s almost no one in the world who can understand the whole thing: “logic, statistics, graphical models, random ensembles, and statistical physics”.
As this Slashdot comment points out, that shouldn’t be a problem:
If he did his work right the proof of each individual [step] is a trivial thing; only their connection is non-trivial.
But personally, being trained in computer science and formal logic, and believing the realitivist theory of mathematics, I’ve always been suspicious of mathematical proofs. As this Slashdot comment explains, I don’t see why I should trust the peer review process to scale up to proofs of this complexity. I know mathematicians think I’m crazy, but I can’t help but think that mathematical proofs should be machine-verifiable. That way, as this Slashdot commenter jokes:
We’ll know that P != NP if it takes us less time to verify the proof as it took him to generate it.
Detecting Election Fraud
Benford’s Law says that if you take a list of numbers from certain processes, lots of them will start with “1″ and very few of them will start with a “9″. This is because we use a base-10 counting system, so if something is growing at a steady average rate it takes just as long to get to 2000 as it took to get to 1000. eg: 139, 253, 443, 463, 585, 745, 884, 1028, 1108, 1299, 1424, 1514, 1531, 1710, 1818, 2051*. The chances that you sample it when it starts with a “1″ are higher than the chance that you sample it when it starts with anything else.
Benford’s Law is good at detecting financial fraud because financial calculations have patterns that cause steady average grow: most importantly, multiplying quantities and prices. Since most electoral systems divide voters into equal-sized voting areas, the votes in each area don’t grow at a steady rate. So instead electoral analysts skip the first digit and apply Benford’s Law to the later digits in results.
This analysis assumes that votes for each candidate will grow at a steady rate. In other words, if you combine two voting areas, the results for each candidate should, on average, double. But representational democracies allocate representatives based on the regions precisely because that assumption is wrong.
Other election analysis assume that voters, when averaged together, act completely randomly. If that assumption is correct, I’m not sure why fraudulent elections are a bad thing.
Statistical analysis to detect election fraud is a very new field. Much of the work is being done in the US, which means researchers have certain biases:
- they don’t understand the boundaries of the places they’re analyzing
- they have incentives to find fraud in the elections of America’s enemies
- they have disincentives to find fraud in the elections of Western democracies (in particular it is taboo to analyze the 2000 Presidential election)
- they assume divisive bipartisan politics
I’d love to see a paper where someone analyzes a real election known to be fraudulent rather than a simulation or at least applies a method to a large, broad sample of elections.
* I generated these using a uniform distribution from 0 to 200.


