More on Following Benford's law

Chapter 34: Explaining Benford's Law

This last result is very surprising; the mystery of Benford's law turns out to be nothing more than distribution width. Figure 34-7 demonstrates this using our previous examples. Figures (a) and (c) are the histograms of the income tax return and the RNG numbers, respectively, on the logarithmic scale. Figures (b) and (d) are their Fourier transforms. The Benford's Law Compliance Theorem tells us that the income tax numbers, whose transform is shown in (b), will follow Benford's law very closely, while the RNG numbers, whose transform is shown in (d), will follow it very poorly. That is, PDF(f) falls to near zero before f=1 for the income tax numbers, but does not for the RNG numbers. The next step is less rigorous, but still perfectly clear. Figure (b) falls to zero quickly because (a) is broad. Likewise, (d) falls to zero more slowly because (c) is narrow.
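The link between width and transform can be checked numerically. The sketch below is only an illustration: it assumes a Gaussian-shaped pdf on the logarithmic axis as a stand-in for both data sets, and the two widths (1.5 and 0.301 decades) are hypothetical values chosen to be "broad" and "narrow", not figures taken from the book's data. The function name `pdf_transform_magnitude` is likewise made up for this example.

```python
import cmath
import math

def pdf_transform_magnitude(sigma, f=1.0, center=0.0, step=0.001):
    """Magnitude at frequency f of the Fourier transform of a Gaussian
    pdf(g) on the log10 axis. sigma is the standard deviation in decades.
    The Gaussian shape is an assumption made for this sketch."""
    half_width = 10.0 * sigma                      # integrate out to +/- 10 sigma
    n_steps = int(round(2.0 * half_width / step))
    norm = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0 + 0.0j
    for i in range(n_steps + 1):
        g = center - half_width + i * step
        density = math.exp(-0.5 * ((g - center) / sigma) ** 2) / norm
        # Riemann-sum approximation of the Fourier integral
        total += density * cmath.exp(-2j * math.pi * f * g) * step
    return abs(total)

broad = pdf_transform_magnitude(sigma=1.5)      # hypothetical broad distribution
narrow = pdf_transform_magnitude(sigma=0.301)   # hypothetical narrow distribution
print(f"broad : |PDF(1)| = {broad:.2e}")
print(f"narrow: |PDF(1)| = {narrow:.3f}")
```

Under these assumptions the broad distribution's transform is essentially zero at f=1, while the narrow one still has a substantial value there.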

This also tells us something about the magic trick. If the distribution is wide compared with unit distance on the log axis, it means that the spread in the set of numbers being examined is much greater than ten. For instance, look back at the income tax numbers shown in Fig. 34-2a. The largest numbers in this set are about a million times greater in value than the smallest numbers. This extensive spread is a key part of stamping the logarithmic pattern into the data. That is, 543,923,100 must be divided by 100,000,000 to place it between 1 and 9.99999, while 1,221 only needs to be divided by 1,000. In other words, different numbers are being treated differently, all according to an anti-logarithmic pattern.
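The scaling step described above can be sketched in a few lines. This is a minimal illustration, not code from the book; the helper names `scale_to_modified` and `leading_digit` are hypothetical, and `Decimal` is used only to sidestep floating-point rounding when dividing by powers of ten.

```python
from decimal import Decimal

def scale_to_modified(x):
    """Return (divisor, modified), where divisor is the power of ten
    that places modified = x / divisor between 1 and 9.99999..."""
    d = Decimal(str(x))
    if d <= 0:
        raise ValueError("this sketch handles positive numbers only")
    divisor = Decimal(10) ** d.adjusted()   # adjusted() is the decimal exponent
    return divisor, d / divisor

def leading_digit(x):
    """Integer portion of the modified number."""
    return int(scale_to_modified(x)[1])

for value in (543923100, 1221):
    divisor, modified = scale_to_modified(value)
    print(f"{value} / {divisor} = {modified}, leading digit {leading_digit(value)}")
```

The two values from the text end up with divisors of 100,000,000 and 1,000, which is the sense in which different numbers are treated differently.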

Now look at the RNG numbers in Fig. 34-2, a group that does not obey Benford's law. The largest numbers in this set are about four times the smallest numbers (measured from -σ to +σ). That is, they are grouped relatively close together in value. When we extract the leading digits from these numbers, most of them are treated exactly the same. For instance, both 7.844026 and 1.230605 are divided by 1 to place them between 1 and 9.99999. Likewise, numbers clustered around 5,000 would all be divided by 1,000 to extract the leading digits. Since the vast majority of the numbers are being treated the same, or nearly the same, the distortion of the data is relatively weak. That is, the logarithmic pattern cannot be introduced into the data, and the magic trick fails.
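A small simulation makes the contrast concrete. It is a sketch under stated assumptions: both sets are drawn from log-normal distributions, the narrow one with a factor of four between -σ and +σ (σ = 0.301 decades, centered on 5) and the wide one with σ = 1.5 decades. These parameters, the seed, and the function names are hypothetical choices for illustration, not the book's actual data.

```python
import math
import random

def leading_digit_from_log(g):
    """Leading digit of the number 10**g, from the fractional part of g."""
    frac = g - math.floor(g)
    return min(9, int(10.0 ** frac))   # min() guards a rounding edge at frac ~ 1

def leading_digit_fractions(center, sigma, count=100_000, seed=1):
    """Fractions of leading digits 1..9 for samples with
    log10(x) ~ Normal(center, sigma). Parameters are illustrative."""
    rng = random.Random(seed)
    tallies = [0] * 10
    for _ in range(count):
        tallies[leading_digit_from_log(rng.gauss(center, sigma))] += 1
    return [t / count for t in tallies[1:]]

benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
wide = leading_digit_fractions(center=4.0, sigma=1.5)
narrow = leading_digit_fractions(center=math.log10(5.0), sigma=0.301)

print("digit  Benford  wide   narrow")
for d in range(1, 10):
    print(f"{d:5d}  {benford[d-1]:.3f}    {wide[d-1]:.3f}  {narrow[d-1]:.3f}")
```

With these settings the wide set tracks the Benford fractions closely, while the narrow set departs from them noticeably.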

How does Benford's law behave in other bases? Suppose you repeat the previous derivation in base 4 instead of base 10. The base 4 logarithmic number line is used and the Benford's Law Compliance Theorem still holds. The difference comes in when we compare the width of our test distribution with one unit of distance on the logarithmic scale. One unit of distance in base 4 is only log₁₀(4) = 0.602 times the length of one unit in base 10, making it easier for the distribution to comply with Benford's law. In terms of the magic trick, the spread in the numbers being examined only needs to be much greater than four, rather than ten. In the common case where PDF(f) smoothly decreases, Benford's law will always be followed better when converted to a lower base, and worse if converted to a higher base. For instance, the income tax numbers will not follow Benford's law if converted to base 10,000 or above (making the unit distance on the log scale four times greater). Likewise, the RNG numbers will follow Benford's law if converted to base 2 (shortening the unit distance to log₁₀(2) = 0.301).
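The effect of the base can be illustrated with the same kind of simulation. This sketch assumes a narrow log-normal set (σ = 0.301 decades, centered on 5, both hypothetical) and, because leading digits are uninformative in base 2, measures the fraction of modified numbers that fall below the square root of the base. If the logarithmic pattern holds, that fraction is 0.5 in any base. The function name and parameters are assumptions made for this example.

```python
import math
import random

def lower_half_fraction(base, center=math.log10(5.0), sigma=0.301,
                        count=100_000, seed=1):
    """Fraction of samples whose base-`base` modified number lies in
    [1, sqrt(base)). Samples have log10(x) ~ Normal(center, sigma)."""
    rng = random.Random(seed)
    unit = math.log10(base)                    # one base-`base` unit, in decades
    hits = 0
    for _ in range(count):
        g = rng.gauss(center, sigma) / unit    # position on the base-`base` log axis
        if g - math.floor(g) < 0.5:            # fractional part in the lower half
            hits += 1
    return hits / count

deviation = {}
for base in (2, 4, 10):
    deviation[base] = abs(lower_half_fraction(base) - 0.5)
    print(f"base {base:2d}: unit = {math.log10(base):.3f} decades, "
          f"deviation from 0.5 = {deviation[base]:.4f}")
```

Under these assumptions the deviation is large in base 10 and shrinks toward sampling noise as the base is lowered to 2.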

A note for advanced readers: You may have noticed a problem with this last statement, namely that all numbers in base 2 have a leading digit of 1. However, a more sophisticated definition of Benford's law can be used to eliminate issues of this sort. The leading digit of a number can be found by repeatedly multiplying or dividing the number by ten until it is between 1 and 9.99999, and then taking the integer portion. The advanced method stops after the first step, and directly looks at the pdf of the numbers running between 1 and 9.99999. We will call these the modified numbers. If Benford's law is being followed, a(n) = k/n, where a(n) is the probability density function of the modified numbers on the linear scale, and k is a constant providing unity area under the pdf curve. If needed for some purpose, we can find the fraction of numbers that have a leading digit of 1 by integrating a(n) from 1 to 2. Since the integral of k/n is the logarithm, if Benford's law is being followed this fraction is given by: log₁₀(2) - log₁₀(1) = 0.301. That is, we can easily move from the advanced representation to the simpler leading-digit definition.
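The move from the k/n form back to leading digits can be verified numerically. The sketch below takes k = 1/ln(10), which is the value that gives unit area over the interval from 1 to 10 in base 10, and integrates with a simple midpoint rule; the helper names are hypothetical.

```python
import math

K = 1.0 / math.log(10.0)   # constant giving unit area under k/n on [1, 10)

def a(n):
    """pdf of the modified numbers when Benford's law is followed."""
    return K / n

def integrate(func, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of func from lo to hi."""
    width = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * width) for i in range(steps)) * width

area = integrate(a, 1.0, 10.0)   # total area under the pdf
ones = integrate(a, 1.0, 2.0)    # fraction with a leading digit of 1
print(f"area = {area:.6f}")
print(f"leading digit 1: {ones:.6f}  (log10(2) = {math.log10(2):.6f})")
```

Integrating over each interval from d to d+1 in the same way gives the fraction for any other leading digit d.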

This "k/n" form of Benford's law can also be derived from the method of Fig. 34-5. The fraction of the modified numbers that are greater than p but less than q is found by integrating a(n) between p and q. Further, this fraction will remain a constant under the scaling test if Benford's law is being followed. However, this value is also equal to the average value of the appropriate scaling function. The logic here is the same as that used to show that the average value of ost(g) is equal to the average value of sf(g) in "Solving Mystery #1." These two factors become the left and right sides of the following equation, respectively:

Solving this equation results in Benford's law, i.e., a(n) = k/n.
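The steps of that solution can be sketched as follows. This is a reconstruction under one assumption, not a restatement of the book's equation: the appropriate scaling function is taken to be a unit-height pulse running from log₁₀(p) to log₁₀(q) within each unit of the logarithmic axis, so that its average value is log₁₀(q) - log₁₀(p).

```latex
% Assumed form: left side is the fraction of modified numbers between p and q,
% right side is the average value of the (assumed) rectangular scaling function.
\int_{p}^{q} a(n)\,dn \;=\; \log_{10} q - \log_{10} p ,
\qquad 1 \le p < q \le 10

% Differentiate both sides with respect to q:
a(q) \;=\; \frac{d}{dq}\,\log_{10} q \;=\; \frac{1}{q \,\ln 10}

% Hence, renaming q as n:
a(n) \;=\; \frac{k}{n}, \qquad k = \frac{1}{\ln 10} \approx 0.434
```

Setting p = 1 and q = 2 in the first line recovers the leading-digit fraction of 0.301.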

The Scientist and Engineer's Guide to
Digital Signal Processing
By Steven W. Smith, Ph.D.
