Sundarapandian et al. (Eds) : CSE, CICS, DBDM, AIFL, SCOM - 2013
pp. 01–10, 2013. © CS & IT-CSCP 2013 DOI : 10.5121/csit.2013.3301
Hamming Distance and Data Compression
of 1-D CA
Raied Salman
School of Computer Information Systems
Stratford University
11104 West Broad Street
Glen Allen, VA 23060
rsalman@stratford.edu
ABSTRACT :
This paper presents an application of the von Neumann correction technique to the output strings
of some chaotic rules of 1-D Cellular Automata that are unsuitable for cryptographic pseudo random
number generation because of the non-uniform distribution of their binary elements.
The one-dimensional (1-D) Cellular Automata (CA) rule space is classified by the time run
of the Hamming Distance (HD). This has the advantage of identifying the rules that have short
cycle lengths and are therefore deemed unsuitable for cryptographic pseudo random number
generation. The data collected from the evolution of chaotic rules that have long cycles are
subjected to the original von Neumann density correction scheme as well as a new generalized
scheme presented in this paper, and are tested for statistical fitness using the Diehard battery of
tests. Results show that a significant improvement in the statistical tests is obtained when the
output of a balanced chaotic rule is exclusive-ORed (XORed) with the output of an unbalanced
chaotic rule that has undergone von Neumann density correction.
KEYWORDS :
Cellular Automata, Hamming Distance, Pseudo Random Number Generator (PRNG)
1. INTRODUCTION
Random numbers are in great demand for a wide spectrum of applications such as
cryptography, Monte Carlo simulation, VLSI testing, etc. Truly random numbers, which are obtainable
only from natural sources, are not suitable because they are not reproducible. Hence, pseudo random
number generation (PRNG) has been established as the best alternative. Pseudo random numbers (PRNs)
can be produced by mathematical formulae, where repeatability is a persistent problem even though
they produce good statistical properties. Recently, however, cellular automata (CA) have been
tried and have proved quite viable, with the added advantage of ease of hardware
implementation and operation. One significant advantage of some of the chaotic rules of the CA is
that they can produce outputs that are not amenable to mathematical representation and are therefore hard to
2 Computer Science & Information Technology (CS & IT)
reproduce by the adversary, a necessary condition for cryptographic applications. The problem is
to find the suitable rule or rules in a very large rule space. Researchers have long sought to
classify CA rules [1-3]. A seminal and widely referenced attempt is that due to Wolfram [4]. Wolfram's
classification scheme was influential and thorough. The extensive computer simulation carried out
by Wolfram relied heavily on inferences drawn from a phenomenological study of the space-time
diagrams of the evolution of all the $2^{2^{2r+1}}$ rules, where $r$ is the radius of the neighborhood of
the center cell that is being updated in discrete time steps running under the Galois field $GF(2)$
[2,5,6]. Although $r = 1$ was mostly adopted in order to keep the rule space tractable
with the computational power of existing computers, larger values of $r$
have nevertheless also been attempted, mostly with genetic algorithms [7]. Some prominent
researchers have introduced ad hoc parameters in their attempts to classify the rule space [8,9].
Unfortunately, none of these methods has culminated in a well-defined classification of the CA
rule space. For a binary one-dimensional (1-D) CA with a neighborhood of radius $r$ the rule space
has size $2^{2^{2r+1}}$. Even for an elementary 1-D CA, where $r = 1$, the rule space reduces to
$2^8 = 256$ rules, which still makes an exhaustive search a difficult and time-consuming process.
A neighborhood radius only one cell larger, $r = 2$, produces a huge rule space of $2^{32}$ rules,
rendering any linear search scheme computationally infeasible. One useful and statistically dependable
approach is cross correlation between two delayed versions of the evolution runs of the CA. This
research presents a new approach that can partially resolve the search problem by using
the Hamming Distance (HD) between consecutive configurations in the time evolution of the
CA and observing the cyclic behavior of this metric. This approach can show in a straightforward
manner that rules that result in a cyclic HD are actually cyclic and can therefore be judged
unsuitable for PRN generation. Since this operation does not require a large amount of data,
the search can be finished in a relatively short time. It has been observed that the HD
approach can identify Wolfram's category IV (the so-called complex rules) much faster than
expected. In fact, the difference between category II and category IV almost vanishes. Both of
these categories, as well as category I, are unsuitable for PRN generation.
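The growth of the rule space with the radius can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name `rule_space_size` and its `states` parameter are assumptions introduced here, not part of the original paper.

```python
def rule_space_size(r, states=2):
    """Number of distinct CA rules for neighborhood radius r.

    A neighborhood holds 2r + 1 cells, so there are states**(2r + 1)
    possible neighborhoods, and a rule assigns one of `states` outputs
    to each of them.
    """
    neighborhood = 2 * r + 1
    return states ** (states ** neighborhood)


if __name__ == "__main__":
    for r in (1, 2):
        print(f"r = {r}: {rule_space_size(r)} rules")
```

For $r = 1$ this gives the 256 elementary rules, and for $r = 2$ it gives $2^{32}$, which is why an exhaustive search is practical only for the elementary case.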
2. PRELIMINARIES
This paper deals with a homogeneous lattice of one-dimensional cellular automata (1-D CA). The
present state of any cell at time $t$ is denoted by $\sigma(t, l)$, where $l \in \{1, 2, \ldots, L\}$ is the
spatial index on a lattice of length $L$ bits.
The CA can evolve using a single rule or can use multiple rules in either or both of the space and
time dimensions. When more than one rule is used it is usually referred to as a Hybrid CA. In this
paper a single rule is used and the CA is referred to as a uniform 1-D CA. In order to
limit the size of the lattice, cyclic boundary conditions are applied. This means the end cells
wrap around the lattice. If the rule deals with the center cell and its two nearest neighbors,
such that the radius from the center cell to the neighboring left and right cells is $r = 1$, the CA is
usually referred to as an Elementary CA (ECA). Therefore the rule acting on cell $\sigma(t, 1)$ will
consider the left neighboring cell $\sigma(t, L)$ and the right neighboring cell $\sigma(t, 2)$, as depicted in
Figure 1. Similarly, the rule will act on the rightmost cell $\sigma(t, L)$ such that the left neighboring
cell is $\sigma(t, L-1)$ and the right neighboring cell is $\sigma(t, 1)$. The center cell at an
arbitrary location $l$ and time $t$ is denoted by $\sigma(t, l)$, the left neighboring cell by
$\sigma(t, l-1)$ and the right neighboring cell by $\sigma(t, l+1)$. The initial configuration will
thus be denoted by
$$\Gamma_0 = \{\sigma(0,1), \sigma(0,2), \ldots, \sigma(0,l-1), \sigma(0,l), \sigma(0,l+1), \ldots, \sigma(0,L-1), \sigma(0,L)\} \qquad (1)$$
while an arbitrary configuration will be
$$\Gamma_t = \{\sigma(t,1), \sigma(t,2), \ldots, \sigma(t,l-1), \sigma(t,l), \sigma(t,l+1), \ldots, \sigma(t,L-1), \sigma(t,L)\} \qquad (2)$$
where $t = 0, 1, \ldots, T$ and $T$ is the total evolution time. The rule $R_n$, where $n$ is the rule
number according to the numbering scheme adopted by [4], is a mapping
$R_n : \{0,1\}^3 \to \{0,1\}$, and the next state of
the cell under this rule can be represented by
$$\sigma(t+1, l) = f(\sigma(t, l-1), \sigma(t, l), \sigma(t, l+1)) \qquad (3)$$
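The local update in Equation (3) can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual Wolfram numbering in which the neighborhood (left, center, right), read as a 3-bit number, selects one bit of the rule number; the function name `eca_step` is hypothetical and not taken from the paper.

```python
def eca_step(cells, rule):
    """Apply elementary CA rule `rule` (0-255) once with cyclic boundaries.

    `cells` is a list of 0/1 values representing one configuration.
    """
    n = len(cells)
    nxt = []
    for l in range(n):
        left = cells[(l - 1) % n]    # wraps to the last cell when l == 0
        center = cells[l]
        right = cells[(l + 1) % n]   # wraps to the first cell at the end
        index = (left << 2) | (center << 1) | right
        nxt.append((rule >> index) & 1)  # bit `index` of the rule number
    return nxt


if __name__ == "__main__":
    seed = [0, 0, 0, 1, 0, 0, 0]     # single active center cell
    print(eca_step(seed, 30))
```

The modulo indexing implements the cyclic boundary conditions described above, so the leftmost cell sees the rightmost cell as its left neighbor and vice versa.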
The Hamming distance measures the distance between two binary strings by counting the number
of different bits and can be defined by
$$HD(t) \triangleq \Gamma_t \oplus \Gamma_{t+1} = \sum_{l=1}^{L} \big(\sigma(t, l) \oplus \sigma(t+1, l)\big) \qquad (4)$$
... $\sigma(t, l-2)$  $\sigma(t, l-1)$  $\sigma(t, l)$  $\sigma(t, l+1)$  $\sigma(t, l+2)$ ...
Figure 1. Local Rule Representation
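As a rough illustration of Equation (4), the sketch below evolves an elementary CA and records the Hamming distance between each pair of consecutive configurations. The helper names (`eca_step`, `hamming_distance`, `hd_series`) are assumptions made for this example, and the Wolfram rule numbering is assumed as well.

```python
def eca_step(cells, rule):
    """One synchronous update of an elementary CA with cyclic boundaries."""
    n = len(cells)
    return [
        (rule >> ((cells[(l - 1) % n] << 2) | (cells[l] << 1) | cells[(l + 1) % n])) & 1
        for l in range(n)
    ]


def hamming_distance(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x ^ y for x, y in zip(a, b))


def hd_series(seed, rule, steps):
    """HD(t) between configurations t and t+1, for t = 0 .. steps-1."""
    series = []
    current = list(seed)
    for _ in range(steps):
        nxt = eca_step(current, rule)
        series.append(hamming_distance(current, nxt))
        current = nxt
    return series


if __name__ == "__main__":
    seed = [0, 0, 0, 1, 0, 0, 0]
    print(hd_series(seed, 255, 3))   # Rule 255 fills the lattice with ones
    print(hd_series(seed, 30, 5))
```

Only two configurations need to be held in memory at any time, which is consistent with the claim that the HD approach needs little data.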
3. SPACE RULE CLASSIFICATION
When the HD is applied to an arbitrary time-space data set, two results can be extracted. One is
the transient from the initial configuration until the start of a cycle, if that cycle exists within the
time evolution of the data set. The second is the length of the cycle, if the cycle is captured during
the time evolution. For example, the variation in the HD for a rule that belongs to category I
according to Wolfram's [4] classification is a very short transient that terminates very
sharply at an HD of 0. The short transient length seems to be a typical feature of category I rules, as
shown in Figure 2 for Rule 255, both for a random initial seed and for a seed with an active center
cell and all other cells inactive. The difference in the first HD value is of course due to the initial seed.
Category II rules, represented by Rule 1 (Figure 3), exhibit a relatively longer transient, but the HD
again stabilizes at a constant whose value depends on the initial seed. Category IV
rules, represented by Rule 35 (Figure 4), seem to exhibit similar behavior. The transient
length again differs depending on the initial seed, while the HD asymptotically stabilizes at a
constant value. This behavior also recurs with category III rules (Figure 5), albeit on a larger
scale, but in this case it is not clear whether the HD is indicative of the cycle
length or whether it is a symptom of some hidden but repetitive behavior that cannot be captured
from the space-time diagram. This is a worthwhile topic for further investigation. The
process is simple and fast, since it requires a relatively short evolution time to produce results
that may prove to be significant in the testing of PRNs. It can be conjectured that the HD may be
usable as a fast and efficient tool for testing PRNs for suitability in cryptographic
applications. Based on the data, it can also be concluded that category III rules are the best suited
for PRNs.
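The two quantities discussed above, the transient length and the cycle length, can be estimated from a recorded HD series. The sketch below is one possible way to do this and is not the procedure used in the paper; the function name `hd_transient_and_period` and the requirement of at least two observed periods are assumptions of this example.

```python
def hd_transient_and_period(hd):
    """Return (transient, period) of an observed HD series, or None.

    The series is treated as eventually periodic if hd[t] == hd[t + p]
    for every t >= transient, with at least two full periods observed.
    The smallest such period p is reported.
    """
    n = len(hd)
    for p in range(1, n // 2 + 1):
        mu = n - p
        # Walk backwards while the series still repeats with lag p.
        while mu > 0 and hd[mu - 1] == hd[mu - 1 + p]:
            mu -= 1
        if n - mu >= 2 * p:
            return mu, p
    return None


if __name__ == "__main__":
    print(hd_transient_and_period([5, 3, 2, 2, 2, 2]))     # constant tail
    print(hd_transient_and_period([7, 1, 2, 1, 2, 1, 2]))  # oscillating tail
    print(hd_transient_and_period([1, 2, 3, 4, 5, 6]))     # no cycle seen
```

A constant tail is reported as period 1, which matches the behavior described for categories I, II and IV. A result of `None` only means that no cycle was captured within the observed window.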
Figure 2. Time-Space and HD plots for Rule 255
Figure 3. Time-Space and HD plots for Rule 1
Figure 4. Time-Space and HD plots for Rule 35
Figure 5. Time-Space and HD plots for Rule 255
4. CA DENSITY CORRECTION AND DATA COMPRESSION
Rule space classification usually does not touch upon the density of the CA evolution. Such a
metric is an essential criterion for suitability to generate cryptographically strong PRNs. Although
the rules that may be suitable for PRN generation are restricted to category III, some
of the rules in this category do not produce uniform density. The density must be uniform, such
that the number of ones is equal to, or differs by at most one bit from, the number of zeros in
the data, according to Golomb's randomness postulate number 1 [10]. Such a requirement isolates
a number of rules in category III that can possibly be considered as candidates for PRNs. For
example, Rule 22 and Rule 126 cannot produce the uniform density of 0.5, but they are still
chaotic and belong to category III. The performance of such rules when tested using Diehard is
consequently very poor. If the density of these rules can be corrected, then these rules can be
reconsidered for PRN generation and the repertoire of rules available for PRN generation can be
widened. Fortunately, there exists a very effective and yet very simple approach that is originally
attributed to von Neumann. The method effectively compresses the data according to the steps
depicted in Table 1.
Table 1. von Neumann correction scheme
Original Data    Resultant Data
01               0
10               1
11               delete
00               delete
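The mapping in Table 1 can be sketched as follows. This is an illustrative implementation only: it assumes the bit stream is read in non-overlapping pairs from the start and that a trailing unpaired bit is dropped, neither of which is stated in the paper. The names `von_neumann` and `density` are hypothetical.

```python
def von_neumann(bits):
    """Apply the Table 1 mapping to non-overlapping pairs of bits.

    01 -> 0, 10 -> 1, 00 and 11 -> deleted. A trailing unpaired bit
    is ignored (an assumption of this sketch).
    """
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)  # the first bit of the pair is the output
    return out


def density(bits):
    """Fraction of ones in a non-empty bit list."""
    return sum(bits) / len(bits)


if __name__ == "__main__":
    raw = [0, 1, 1, 0, 1, 1, 0, 0]
    corrected = von_neumann(raw)
    print(corrected, density(raw), density(corrected))
```

Because pairs of equal bits are discarded, the output is always shorter than the input, which is the sense in which the method compresses the data.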
As an example, a 1-D CA of lattice length $L = 31$ bits, run for an evolution time of $T =
2{,}645{,}000$ time steps under Rule 126, produced a density of ones equal to 0.527746. When the von
Neumann reduction scheme described in Table 1 was applied to the same data, the density was
corrected to 0.5. In addition, this density correction is accompanied by two important
features as far as PRN generation is concerned. The first is that the resultant data is now
extremely hard to reproduce, a fundamental and necessary requirement for cryptographically
strong PRNs. This is clearly due to the loss of information in the correction
process. Therefore the process can be considered irreversible. The second is a
byproduct, namely an improvement in the statistical properties of the rule. For this particular
example the data was tested for statistical strength by the Diehard battery of tests and passed two
tests; one additional test was passed after the density was corrected. A more significant example
is Rule 30, which under the same parameters passed 51 tests, whereas the number of passes jumped to
129 when the data was tested after the application of the von Neumann correction scheme. It is
clear from the space-time diagrams depicted in Figure 6 that Rule 126 and Rule 30 have
undergone significant randomization, which is reflected in the space-time diagrams as well as in
the number of passes for both rules, although it was more pronounced with Rule 30, as mentioned
above.
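A short calculation suggests why the corrected density comes out at 0.5 and how much data survives. It rests on an idealized assumption that is not claimed in the paper: the input bits are treated as independent, each equal to 1 with probability $p$. CA output need not satisfy this, so the figures below are indicative only.

$$
\begin{aligned}
P(01) &= (1-p)\,p, \qquad P(10) = p\,(1-p), \\
P(\text{output} = 1 \mid \text{pair kept}) &= \frac{p(1-p)}{p(1-p) + (1-p)p} = \frac{1}{2}, \\
\text{expected output bits per input bit} &= \frac{2p(1-p)}{2} = p(1-p) \le \frac{1}{4}.
\end{aligned}
$$

For the Rule 126 density quoted above, $p = 0.527746$ gives $p(1-p) \approx 0.2492$, so under this assumption roughly a quarter of the original bits would remain after correction.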
Figure 6. Space-Time diagrams for Rules 30 and 126
When the two rules in their uncompressed and compressed forms were linearly mixed with an
exclusive-OR (XOR) operation, as depicted in Figure 7, some remarkable results were
produced, as shown in Table 2. The three combinations R30 uncompressed with R126
uncompressed, R30 uncompressed with R126 compressed, R30 compressed with R126
uncompressed, produced identical results when tested with the Diehard test suite, and the density
was also maintained at the favorable 0.5 level. The combination of R30 compressed with R126
uncompressed (Figure 8) produced the best results, passed all 229 Diehard tests and
maintained the same ideal density of 0.5. It is generally accepted that passing all the
Diehard tests is a strong indication that the PRN generator is suitable for cryptographic
applications. This is in addition to the above-stated hardness of reproducing the generated
sequence. Further research and more details are deemed necessary in order to validate the initial
findings in this paper. It can also be conjectured that the other chaotic rules can produce the same
results.
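The mixing step can be illustrated with a bitwise XOR of two bit streams. The sketch below is an assumption about the general shape of the scheme in Figure 7, not a reproduction of it: in particular, truncating to the shorter stream is a choice made here because a compressed stream is shorter than an uncompressed one. The names `xor_mix` and `von_neumann` are hypothetical.

```python
def von_neumann(bits):
    """Table 1 mapping on non-overlapping pairs: 01 -> 0, 10 -> 1."""
    return [bits[i] for i in range(0, len(bits) - 1, 2) if bits[i] != bits[i + 1]]


def xor_mix(stream_a, stream_b):
    """Bitwise XOR of two bit streams, truncated to the shorter one."""
    return [a ^ b for a, b in zip(stream_a, stream_b)]


if __name__ == "__main__":
    # Toy stand-ins for the two rule outputs; real runs would use CA data.
    out_a = [0, 1, 1, 0, 1, 1, 0, 0, 1, 0]   # to be compressed first
    out_b = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]   # used uncompressed
    print(xor_mix(von_neumann(out_a), out_b))
```

With real data, the streams passed in would be the recorded outputs of the two rules, each in its compressed or uncompressed form depending on the combination being tested.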
Table 3 shows the variation in the Diehard test results for all the runs for Rules 30 and 126 as
well as their mixtures. It can be seen that the Overlapping Sums test number 15 and the GCD test
number 2 were the most difficult to pass, except in the PRN8 case (the combination of R30
compressed with R126 uncompressed).
Table 2. p-values and Density of Rules 30 and 126 mixtures
Figure 7. Rules 30 and 126 mixing scheme
Table 3. Diehard Results for Rules 30 and 126
Figure 8. Space-Time diagrams for Rules 30 and 126
5. CONCLUSIONS
In this paper the Hamming Distance was revisited and applied to the 1-D CA. The original
motivation was the classification of the rule space of the CA. This has been achieved with a
simple yet effective approach. The results show a well-defined behavior of the chaotic rules
of Category III as compared with the behavior of the rules of the other three categories. The
oscillations of the Hamming distance in the transient stage are indicative of the chaotic nature of
the rule. In other words, a high value of the Hamming distance in the transient stage is actually
indicative of Category I or II rules. The Hamming distance values during the oscillation period do
not vary as much as they do during the transient stage. It can be concluded that Category III
rules are the rules best suited for PRN generation. The behavior of Category I rules seems to be
very clear: their time evolution reaches a Hamming distance equal to 0 after only one or two time
steps, depending on the initial seed. Category II and IV rules seem to behave in a similar
manner. They both reach a constant Hamming distance after a very short transient, with a
slight difference in the values of the Hamming distance during the transient, but the
asymptotic behavior is the same. Therefore, the new categorization of the rule space is that it
consists of three distinct types: Category I, Categories II and IV combined, and
Category III. This seems to agree with the findings of some past researchers who argued strongly
against the separate categorization of Category IV. The finding in this paper can reduce the rule
search significantly. The correlation technique that is usually used in the analysis of pseudo
random number generation can indicate the amount of correlation between two delayed versions
of the data as well as the distance between the cycles, if the cycles exist. In this paper the
Hamming distance is used as an alternative. The advantage of the Hamming Distance approach
compared with the correlation approach is that the Hamming distance can show the transient
stage (the number of time steps to finish the transient orbit, or the system time constant) as well as
the cycles with clear repetition, a feature that the correlation technique is unable to
produce. In addition, the Hamming distance can arrive at the results in a very short time, while the
cross correlation technique requires the full length of the data and much more computational
effort.
It is also clear from the results in this paper and the findings of previous research that not all rules
of Category III are suitable for PRN generation. One persistent problem with the rules that are
deemed unsuitable is the non-uniform density of their output, as is the case for
Rule 126. The application of the von Neumann reduction scheme proved to be beneficial. The density
was corrected to the desirable value of 0.5. A byproduct of this was an
improvement in the randomization, as depicted in the images produced, which was also validated
by the increase in test passes. A more significant improvement was in the number of tests passed
by Rule 30, which jumped from 51 prior to the application of the reduction scheme to 129 after the
application of the scheme. Another remarkable result was achieved when the two rules,
R126 and R30, were linearly mixed together. When a reduced output of R30 was
exclusive-ORed with the unreduced output of R126, the output data passed all
229 Diehard tests, a result that is extremely difficult to achieve with other PRN sources. Further
effort may be required to validate the findings in this paper as well as to show that the approach is
equally applicable to the other chaotic rules.
REFERENCES
[1] G. Eason, B. Noble, and I. N. Sneddon, “On certain integrals of Lipschitz-Hankel type involving
products of Bessel functions,” Phil. Trans. Roy. Soc. London, vol. A247, pp. 529–551, April 1955.
[2] J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed., vol. 2. Oxford: Clarendon, 1892,
pp.68–73.
[3] I. S. Jacobs and C. P. Bean, “Fine particles, thin films and exchange anisotropy,” in Magnetism, vol.
III, G. T. Rado and H. Suhl, Eds. New York: Academic, 1963, pp. 271–350.
[4] K. Elissa, “Title of paper if known,” unpublished.
[5] R. Nicole, “Title of paper with only first word capitalized,” J. Name Stand. Abbrev., in press.
[6] Y. Yorozu, M. Hirano, K. Oka, and Y. Tagawa, “Electron spectroscopy studies on magneto-optical
media and plastic substrate interface,” IEEE Transl. J. Magn. Japan, vol. 2, pp. 740–741, August 1987
[Digests 9th Annual Conf. Magnetics Japan, p. 301, 1982].
[7] M. Young, The Technical Writer's Handbook. Mill Valley, CA: University Science, 1989.

More Related Content

PDF
Jmessp13420104
Nertila Ismailaja
 
PDF
IFAC2008art
Yuri Kim
 
PDF
Probability-Based Analysis to Determine the Performance of Multilevel Feedbac...
Eswar Publications
 
PDF
COCOON14
Yuan Tang
 
PDF
A Modular approach on Statistical Randomness Study of bit sequences
Eswar Publications
 
DOC
Chronological Decomposition Heuristic: A Temporal Divide-and-Conquer Strateg...
Alkis Vazacopoulos
 
PDF
COMPUTATIONAL PERFORMANCE OF QUANTUM PHASE ESTIMATION ALGORITHM
csitconf
 
Jmessp13420104
Nertila Ismailaja
 
IFAC2008art
Yuri Kim
 
Probability-Based Analysis to Determine the Performance of Multilevel Feedbac...
Eswar Publications
 
COCOON14
Yuan Tang
 
A Modular approach on Statistical Randomness Study of bit sequences
Eswar Publications
 
Chronological Decomposition Heuristic: A Temporal Divide-and-Conquer Strateg...
Alkis Vazacopoulos
 
COMPUTATIONAL PERFORMANCE OF QUANTUM PHASE ESTIMATION ALGORITHM
csitconf
 

What's hot (19)

PDF
COMPUTATIONAL PERFORMANCE OF QUANTUM PHASE ESTIMATION ALGORITHM
cscpconf
 
PDF
Analysis of single server fixed batch service queueing system under multiple ...
Alexander Decker
 
PPTX
ICDE-2015 Shortest Path Traversal Optimization and Analysis for Large Graph C...
Waqas Nawaz
 
DOC
Ds lesson plan
AVC College of Engineering
 
PDF
On selection of periodic kernels parameters in time series prediction
csandit
 
PDF
Ojchd.000546
crimsonpublishersOJCHD
 
PDF
Parellelism in spectral methods
Ramona Corman
 
PDF
Box-fitting algorithm presentation
Ridlo Wibowo
 
PPTX
Biological sequences analysis
Davide Andrea Guastella
 
PDF
On Projected Newton Barrier Methods for Linear Programming and an Equivalence...
SSA KPI
 
PDF
Sensor Fusion Study - Ch15. The Particle Filter [Seoyeon Stella Yang]
AI Robotics KR
 
PDF
AN EFFICIENT PARALLEL ALGORITHM FOR COMPUTING DETERMINANT OF NON-SQUARE MATRI...
ijdpsjournal
 
PDF
Sensor Fusion Study - Ch3. Least Square Estimation [강소라, Stella, Hayden]
AI Robotics KR
 
PDF
Design of ternary sequence using msaa
Editor Jacotech
 
PDF
New Data Association Technique for Target Tracking in Dense Clutter Environme...
CSCJournals
 
PPT
Recurrence Quantification Analysis : Tutorial & application to eye-movement data
Deb Aks
 
PDF
poster
Ran Zhou
 
PDF
Quantum algorithm for solving linear systems of equations
XequeMateShannon
 
PDF
NNPDF3.0: parton distributions for the LHC Run II
juanrojochacon
 
COMPUTATIONAL PERFORMANCE OF QUANTUM PHASE ESTIMATION ALGORITHM
cscpconf
 
Analysis of single server fixed batch service queueing system under multiple ...
Alexander Decker
 
ICDE-2015 Shortest Path Traversal Optimization and Analysis for Large Graph C...
Waqas Nawaz
 
On selection of periodic kernels parameters in time series prediction
csandit
 
Parellelism in spectral methods
Ramona Corman
 
Box-fitting algorithm presentation
Ridlo Wibowo
 
Biological sequences analysis
Davide Andrea Guastella
 
On Projected Newton Barrier Methods for Linear Programming and an Equivalence...
SSA KPI
 
Sensor Fusion Study - Ch15. The Particle Filter [Seoyeon Stella Yang]
AI Robotics KR
 
AN EFFICIENT PARALLEL ALGORITHM FOR COMPUTING DETERMINANT OF NON-SQUARE MATRI...
ijdpsjournal
 
Sensor Fusion Study - Ch3. Least Square Estimation [강소라, Stella, Hayden]
AI Robotics KR
 
Design of ternary sequence using msaa
Editor Jacotech
 
New Data Association Technique for Target Tracking in Dense Clutter Environme...
CSCJournals
 
Recurrence Quantification Analysis : Tutorial & application to eye-movement data
Deb Aks
 
poster
Ran Zhou
 
Quantum algorithm for solving linear systems of equations
XequeMateShannon
 
NNPDF3.0: parton distributions for the LHC Run II
juanrojochacon
 
Ad

Viewers also liked (18)

PPTX
Error Detection and Correction - Data link Layer
Abdullaziz Tagawy
 
PDF
Approximation Algorithms Part Two: More Constant factor approximations
Benjamin Sach
 
PDF
Pattern Matching Part Three: Hamming Distance
Benjamin Sach
 
PDF
Approximation Algorithms Part Four: APTAS
Benjamin Sach
 
PPT
Ch10
Lia Nakid
 
PPTX
Error Detection N Correction
Ankan Adhikari
 
PPTX
Data compression techniques
Deep Bhatt
 
PDF
Error detection and correction
Sisir Ghosh
 
PDF
Data compression huffman coding algoritham
Rahul Khanwani
 
PPT
Error Correction And Hamming Code Ibrar
ibrar562
 
PPTX
Parity check(Error Detecting Codes)
Imesha Perera
 
PPT
Huffman Coding
anithabalaprabhu
 
PPT
Errror Detection and Correction
Mahesh Kumar Attri
 
PPT
Error detection and correction
Siddique Ibrahim
 
PPT
Error control, parity check, check sum, vrc
Huawei Technologies
 
PPTX
FEC-Forward Error Correction for Optics Professionals..www.mapyourtech.com
MapYourTech
 
PPT
Error Detection And Correction
Renu Kewalramani
 
PPTX
Computer Networks - Error Detection & Error Correction
Saikrishna Tanguturu
 
Error Detection and Correction - Data link Layer
Abdullaziz Tagawy
 
Approximation Algorithms Part Two: More Constant factor approximations
Benjamin Sach
 
Pattern Matching Part Three: Hamming Distance
Benjamin Sach
 
Approximation Algorithms Part Four: APTAS
Benjamin Sach
 
Ch10
Lia Nakid
 
Error Detection N Correction
Ankan Adhikari
 
Data compression techniques
Deep Bhatt
 
Error detection and correction
Sisir Ghosh
 
Data compression huffman coding algoritham
Rahul Khanwani
 
Error Correction And Hamming Code Ibrar
ibrar562
 
Parity check(Error Detecting Codes)
Imesha Perera
 
Huffman Coding
anithabalaprabhu
 
Errror Detection and Correction
Mahesh Kumar Attri
 
Error detection and correction
Siddique Ibrahim
 
Error control, parity check, check sum, vrc
Huawei Technologies
 
FEC-Forward Error Correction for Optics Professionals..www.mapyourtech.com
MapYourTech
 
Error Detection And Correction
Renu Kewalramani
 
Computer Networks - Error Detection & Error Correction
Saikrishna Tanguturu
 
Ad

Similar to Hamming Distance and Data Compression of 1-D CA (20)

PDF
Dce a novel delay correlation
ijdpsjournal
 
PDF
DCE: A NOVEL DELAY CORRELATION MEASUREMENT FOR TOMOGRAPHY WITH PASSIVE REAL...
ijdpsjournal
 
PDF
Entropy 12-02268-v2
CAA Sudan
 
PDF
ANALYSIS OF ELEMENTARY CELLULAR AUTOMATA BOUNDARY CONDITIONS
ijcsit
 
PDF
Choice of Numerical Integration Method for Wind Time History Analysis of Tall...
inventy
 
PDF
Pakdd
Siswanto .
 
PDF
Ill-posedness formulation of the emission source localization in the radio- d...
Ahmed Ammar Rebai PhD
 
PDF
Time alignment techniques for experimental sensor data
IJCSES Journal
 
PDF
Joint Timing and Frequency Synchronization in OFDM
idescitation
 
PDF
A QUANTITATIVE ANALYSIS OF HANDOVER TIME AT MAC LAYER FOR WIRELESS MOBILE NET...
ijwmn
 
PDF
A Hybrid Deep Neural Network Model For Time Series Forecasting
Martha Brown
 
PDF
Acl Optimisation - Computer Networks
Maria Almudena Garcia-Fraile Fraile
 
PDF
solver (1)
Raj Mitra
 
PDF
Chaos in Small-World Networks
Xin-She Yang
 
PDF
Tutorial marzo2011 villen
Manuel Villen Altamirano
 
PDF
Oscar Nieves (11710858) Computational Physics Project - Inverted Pendulum
Oscar Nieves
 
PDF
Design and analysis of a model predictive controller for active queue management
ISA Interchange
 
PDF
ARIMA Models - [Lab 3]
Theodore Grammatikopoulos
 
PDF
Smoothed Particle Hydrodynamics
ナム-Nam Nguyễn
 
PDF
PERFORMANCE AND COMPLEXITY ANALYSIS OF A REDUCED ITERATIONS LLL ALGORITHM
IJCNCJournal
 
Dce a novel delay correlation
ijdpsjournal
 
DCE: A NOVEL DELAY CORRELATION MEASUREMENT FOR TOMOGRAPHY WITH PASSIVE REAL...
ijdpsjournal
 
Entropy 12-02268-v2
CAA Sudan
 
ANALYSIS OF ELEMENTARY CELLULAR AUTOMATA BOUNDARY CONDITIONS
ijcsit
 
Choice of Numerical Integration Method for Wind Time History Analysis of Tall...
inventy
 
Pakdd
Siswanto .
 
Ill-posedness formulation of the emission source localization in the radio- d...
Ahmed Ammar Rebai PhD
 
Time alignment techniques for experimental sensor data
IJCSES Journal
 
Joint Timing and Frequency Synchronization in OFDM
idescitation
 
A QUANTITATIVE ANALYSIS OF HANDOVER TIME AT MAC LAYER FOR WIRELESS MOBILE NET...
ijwmn
 
A Hybrid Deep Neural Network Model For Time Series Forecasting
Martha Brown
 
Acl Optimisation - Computer Networks
Maria Almudena Garcia-Fraile Fraile
 
solver (1)
Raj Mitra
 
Chaos in Small-World Networks
Xin-She Yang
 
Tutorial marzo2011 villen
Manuel Villen Altamirano
 
Oscar Nieves (11710858) Computational Physics Project - Inverted Pendulum
Oscar Nieves
 
Design and analysis of a model predictive controller for active queue management
ISA Interchange
 
ARIMA Models - [Lab 3]
Theodore Grammatikopoulos
 
Smoothed Particle Hydrodynamics
ナム-Nam Nguyễn
 
PERFORMANCE AND COMPLEXITY ANALYSIS OF A REDUCED ITERATIONS LLL ALGORITHM
IJCNCJournal
 

More from csitconf (8)

PDF
MRI IMAGES THRESHOLDING FOR ALZHEIMER DETECTION
csitconf
 
PDF
EDGE DETECTION IN RADAR IMAGES USING WEIBULL DISTRIBUTION
csitconf
 
PDF
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGES
csitconf
 
PDF
IRIS BIOMETRIC RECOGNITION SYSTEM EMPLOYING CANNY OPERATOR
csitconf
 
PDF
PLANNING BY CASE-BASED REASONING BASED ON FUZZY LOGIC
csitconf
 
PDF
SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O...
csitconf
 
PDF
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
csitconf
 
PDF
FEEDBACK SHIFT REGISTERS AS CELLULAR AUTOMATA BOUNDARY CONDITIONS
csitconf
 
MRI IMAGES THRESHOLDING FOR ALZHEIMER DETECTION
csitconf
 
EDGE DETECTION IN RADAR IMAGES USING WEIBULL DISTRIBUTION
csitconf
 
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGES
csitconf
 
IRIS BIOMETRIC RECOGNITION SYSTEM EMPLOYING CANNY OPERATOR
csitconf
 
PLANNING BY CASE-BASED REASONING BASED ON FUZZY LOGIC
csitconf
 
SUPERVISED FEATURE SELECTION FOR DIAGNOSIS OF CORONARY ARTERY DISEASE BASED O...
csitconf
 
NEURAL NETWORKS WITH DECISION TREES FOR DIAGNOSIS ISSUES
csitconf
 
FEEDBACK SHIFT REGISTERS AS CELLULAR AUTOMATA BOUNDARY CONDITIONS
csitconf
 

Recently uploaded (20)

PDF
A Strategic Analysis of the MVNO Wave in Emerging Markets.pdf
IPLOOK Networks
 
PPTX
AI and Robotics for Human Well-being.pptx
JAYMIN SUTHAR
 
PDF
Trying to figure out MCP by actually building an app from scratch with open s...
Julien SIMON
 
PDF
Economic Impact of Data Centres to the Malaysian Economy
flintglobalapac
 
PDF
Structs to JSON: How Go Powers REST APIs
Emily Achieng
 
PPTX
Agile Chennai 18-19 July 2025 | Emerging patterns in Agentic AI by Bharani Su...
AgileNetwork
 
PDF
Get More from Fiori Automation - What’s New, What Works, and What’s Next.pdf
Precisely
 
PPTX
AI in Daily Life: How Artificial Intelligence Helps Us Every Day
vanshrpatil7
 
PDF
Automating ArcGIS Content Discovery with FME: A Real World Use Case
Safe Software
 
PDF
Security features in Dell, HP, and Lenovo PC systems: A research-based compar...
Hamming Distance and Data Compression of 1-D CA

Sundarapandian et al. (Eds) : CSE, CICS, DBDM, AIFL, SCOM - 2013
pp. 01–10, 2013. © CS & IT-CSCP 2013 DOI : 10.5121/csit.2013.3301

Raied Salman
School of Computer Information Systems
Stratford University
11104 West Broad Street
Glen Allen, VA 23060
rsalman@stratford.edu

ABSTRACT :
This paper presents an application of the von Neumann correction technique to the output strings of some chaotic rules of 1-D Cellular Automata that are unsuitable for cryptographic pseudo random number generation because of the non-uniform distribution of their binary elements. The one dimensional (1-D) Cellular Automata (CA) rule space is classified by the time run of the Hamming Distance (HD). This has the advantage of identifying the rules that have short cycle lengths and are therefore deemed unsuitable for cryptographic pseudo random number generation. The data collected from the evolution of chaotic rules that have long cycles are subjected to the original von Neumann density correction scheme as well as a new generalized scheme presented in this paper, and are tested for statistical fitness using the Diehard battery of tests. Results show that a significant improvement in the statistical tests is obtained when the output of a balanced chaotic rule is exclusive-ORed with the output of an unbalanced chaotic rule that has undergone von Neumann density correction.

KEYWORDS :
Cellular Automata, Hamming Distance, Pseudo Random Number Generator (PRNG)

1. INTRODUCTION
Random numbers are in great demand for a wide spectrum of applications such as cryptography, Monte Carlo simulation, VLSI testing, etc. True random numbers are obtainable only from natural sources and are not suitable because they are not reproducible. Hence, pseudo random number generation (PRNG) has been established as the best alternative.
These PRNs can be produced by mathematical formulae, for which repeatability is a persistent problem even though they produce good statistical properties. Recently, however, cellular automata (CA) have been tried and have proved quite viable, with the added advantage of ease of hardware implementation and running. One significant advantage of some of the chaotic rules of the CA is that they can produce outputs that are not amenable to mathematical representation and are therefore hard for an adversary to reproduce, a necessary condition for cryptographic applications. The problem is to find the suitable rule or rules in a large rule space.

Computer Science & Information Technology (CS & IT)

Researchers have long sought to classify CA rules [1-3]. A seminal and widely referenced attempt is that due to Wolfram [4]. Wolfram's classification scheme was influential and thorough. The extensive computer simulation carried out by Wolfram relied heavily on inferences drawn from a phenomenological study of the space-time diagrams of the evolution of all the $2^{2^{2r+1}}$ rules, where $r$ is the radius of the neighborhood of the center cell that is being updated in discrete time steps over the Galois field GF(2) [2,5,6]. Although $r = 1$ was mostly adopted in order to keep the rule space practically tractable with the computational power of existing computers, larger values of $r$ have nevertheless also been attempted, mostly with genetic algorithms [7]. Some prominent researchers have introduced ad hoc parameters in their attempts to classify the rule space [8,9]. Unfortunately, none of these methods has culminated in a well-defined classification of the CA rule space.

For a binary one dimensional (1-D) CA with a neighborhood of radius $r$, the rule space has size $2^{2^{2r+1}}$. Even for an elementary 1-D CA, where $r = 1$, the rule space is reduced to $2^8 = 256$ rules, which still makes an exhaustive search a difficult and time consuming process. A neighborhood radius larger by just one cell, $r = 2$, produces a huge rule space of $2^{32}$ rules, rendering any linear search scheme computationally infeasible. One useful and statistically dependable approach is cross correlation between two delayed versions of the evolution runs of the CA.
This research presents a new approach that can partially resolve the search problem by using the Hamming Distance (HD) between consecutive configurations in the time evolution of the CA and observing the cyclic behavior of this metric. This approach can show in a straightforward manner that rules that result in a cyclic HD are actually cyclic and can therefore be judged unsuitable for PRN generation. Since this operation does not require a large amount of data, the search process can be finished in a relatively short time. It has been observed that the HD approach can discover Wolfram's category IV (the so called complex rules) much faster than expected. In fact, the difference between category II and category IV almost vanishes. Both of these categories, as well as category I, are unsuitable for PRN generation.

2. PRELIMINARIES
This paper deals with a homogeneous lattice of one dimensional cellular automata (1-D CA). The present state of any cell at time $t$ is denoted by $\sigma(t, l)$, where $l$, with $1 \le l \le L$, is the spatial index in a lattice of length $L$ bits. The CA can evolve using a single rule or can use multiple rules in either or both of the space and time dimensions. When more than one rule is used, the CA is usually referred to as a Hybrid CA. In this paper a single rule is used and the CA is referred to as a uniform 1-D CA. In order to limit the size of the lattice, cyclic boundary conditions are applied. This means the end cells wrap around the lattice. If the rule deals with the center cell and its two nearest neighbors, such that the radius from the center cell to the neighboring left and right cells is $r = 1$, the CA is usually referred to as an Elementary CA (ECA). Therefore the rule acting on cell $\sigma(t, 1)$ considers the left neighboring cell $\sigma(t, L)$ and the right neighboring cell $\sigma(t, 2)$, as depicted in Figure 1.
Similarly, the rule acting on the rightmost cell $\sigma(t, L)$ takes $\sigma(t, L-1)$ as the left neighboring cell and $\sigma(t, 1)$ as the right neighboring cell. The center cell at an arbitrary location $l$ and time $t$ is denoted by $\sigma(t, l)$, the left neighboring cell by $\sigma(t, l-1)$ and the right neighboring cell by $\sigma(t, l+1)$. The initial configuration is thus denoted by

$$\Gamma_0 = \{\sigma(0,1), \sigma(0,2), \ldots, \sigma(0,l-1), \sigma(0,l), \sigma(0,l+1), \ldots, \sigma(0,L-1), \sigma(0,L)\} \qquad (1)$$

while an arbitrary configuration is

$$\Gamma_t = \{\sigma(t,1), \sigma(t,2), \ldots, \sigma(t,l-1), \sigma(t,l), \sigma(t,l+1), \ldots, \sigma(t,L-1), \sigma(t,L)\} \qquad (2)$$

where $t \in T$ and $T$ is the total evolution time. The rule $R_n$, where $n$ is the rule number according to the numbering scheme adopted by [4], is a mapping $R_n : \{0,1\}^3 \to \{0,1\}$, and the next state of the cell under this rule can be represented by

$$\sigma(t+1, l) = f(\sigma(t, l-1), \sigma(t, l), \sigma(t, l+1)) \qquad (3)$$

The Hamming distance measures the distance between two binary strings by counting the number of differing bits and can be defined by

$$HD(t) \triangleq \Gamma_t \oplus \Gamma_{t+1} = \sum_{l=1}^{L} \left( \sigma(t, l) \oplus \sigma(t+1, l) \right) \qquad (4)$$

[Figure 1. Local Rule Representation: cells $\sigma(t, l-2)$, $\sigma(t, l-1)$, $\sigma(t, l)$, $\sigma(t, l+1)$, $\sigma(t, l+2)$]

3. SPACE RULE CLASSIFICATION
When the HD is applied to an arbitrary time-space data set, two results can be extracted. One is the transient from the initial configuration until the start of a cycle, if that cycle exists within the time evolution of the data set. The second is the length of the cycle, if the cycle is captured during the time evolution. For example, the variation in the HD for a rule that belongs to category I according to Wolfram's [4] classification is a very short transient that terminates very sharply at an HD of 0. The short transient seems to be a typical feature of category I rules, as shown in Figure 2 for Rule 255, both for a random initial seed and for a seed in which the center cell is active and the rest of the cells are inactive. The difference in the first HD value is of course due to the initial seed.
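The update rule of equation (3) and the Hamming distance of equation (4) can be illustrated with a minimal sketch. It assumes Wolfram's rule numbering (the neighborhood read as a 3-bit index into the rule number) and cyclic boundary conditions as described in the Preliminaries; the function names `eca_step`, `hamming_distance` and `hd_series` are hypothetical and are not taken from the paper.

```python
def eca_step(cells, rule):
    """Apply elementary CA rule `rule` (0..255, Wolfram numbering) once.

    `cells` is a sequence of 0/1 values; cyclic boundary conditions are used,
    so the left neighbor of the first cell is the last cell and vice versa.
    """
    n = len(cells)
    nxt = []
    for l in range(n):
        left = cells[(l - 1) % n]
        center = cells[l]
        right = cells[(l + 1) % n]
        index = (left << 2) | (center << 1) | right  # neighborhood as a 3-bit number
        nxt.append((rule >> index) & 1)              # look up that bit of the rule number
    return nxt


def hamming_distance(a, b):
    """Count positions where two equal-length configurations differ (equation 4)."""
    return sum(x ^ y for x, y in zip(a, b))


def hd_series(seed, rule, steps):
    """Return [HD(0), HD(1), ..., HD(steps-1)] between consecutive configurations."""
    series = []
    current = list(seed)
    for _ in range(steps):
        nxt = eca_step(current, rule)
        series.append(hamming_distance(current, nxt))
        current = nxt
    return series


if __name__ == "__main__":
    seed = [0, 0, 0, 1, 0, 0, 0]       # single active center cell, L = 7
    print(hd_series(seed, 255, 3))     # Rule 255 sets every cell to 1
```

For the single-cell seed above, Rule 255 changes six cells in the first step and none afterwards, which matches the sharp drop to an HD of 0 described for category I rules.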
Category II rules, represented by Rule 1 (Figure 3), exhibit a relatively longer transient, but the HD again stabilizes at a constant whose value depends on the initial seed. Category IV rules, represented by Rule 35 (Figure 4), again seem to exhibit similar behavior. The transient length again differs depending on the initial seed, while the HD asymptotically stabilizes at a constant value. This behavior also recurs with category III rules (Figure 5), albeit on a larger scale, but the main point is that in this case it is not clear whether the HD is indicative of the cycle length or whether it is a symptom of some hidden but repetitive behavior that cannot be captured from the space-time diagram. This is a worthwhile topic for further investigation and research.

This process is simple and fast, since it requires a relatively short evolution time to produce results that may prove significant in the testing of PRNs. It can be conjectured that the HD may serve as a fast and efficient tool for testing PRNs for suitability in cryptographic applications. Based on the data it can also be concluded that category III rules are the best suited for PRNs.

Figure 2. Time-Space and HD plots for Rule 255
Figure 3. Time-Space and HD plots for Rule 1
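The two quantities discussed in this section, the transient and the cycle length, can also be measured directly for small lattices. The sketch below is a swapped-in technique, not the paper's HD observation: it records every configuration visited and stops at the first repeat, which gives exact values that an HD plot can be checked against. The names `eca_step` and `transient_and_cycle` and the `max_steps` cutoff are assumptions made for illustration.

```python
def eca_step(cells, rule):
    """One step of elementary CA `rule` (Wolfram numbering), cyclic boundaries."""
    n = len(cells)
    return tuple(
        (rule >> ((cells[(l - 1) % n] << 2) | (cells[l] << 1) | cells[(l + 1) % n])) & 1
        for l in range(n)
    )


def transient_and_cycle(seed, rule, max_steps):
    """Return (transient_length, cycle_length), or None if no configuration
    repeats within `max_steps` time steps."""
    seen = {}                      # configuration -> first time step it appeared
    current = tuple(seed)
    for t in range(max_steps + 1):
        if current in seen:
            first = seen[current]
            return first, t - first
        seen[current] = t
        current = eca_step(current, rule)
    return None


if __name__ == "__main__":
    seed = [0, 0, 0, 1, 0, 0, 0]
    print(transient_and_cycle(seed, 255, 100))  # category I style behavior
    print(transient_and_cycle(seed, 51, 100))   # Rule 51 complements every cell
```

Because every visited configuration is stored, this check is only practical for short lattices or short runs; the HD time series needs no such storage, which is the speed advantage argued for above.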
Figure 4. Time-Space and HD plots for Rule 35
Figure 5. Time-Space and HD plots for Rule 255
4. CA DENSITY CORRECTION AND DATA COMPRESSION
The rule space classification usually does not touch upon the density of the CA evolution. This metric is an essential criterion for suitability to generate cryptographically strong PRNs. Although the rules that may be suitable for PRN generation are restricted to category III, some of the rules in this category do not produce uniform density. The density must be uniform, such that the number of ones is equal to, or differs by at most one bit from, the number of zeros in the data, according to Golomb's randomness postulate number 1 [10]. Such a requirement isolates a number of rules in category III that can possibly be considered as candidates for PRNs. For example, Rule 22 and Rule 126 both fail to produce the uniform density of 0.5, but they are still chaotic and belong to category III. The performance of such rules when tested using Diehard is consequently very poor. If the density of these rules can be corrected, then these rules can be reconsidered for PRN generation and the repertoire of rules available for PRN generation can be widened. Fortunately, there exists a very effective and yet very simple approach that is originally attributed to von Neumann. The method effectively compresses the data according to the steps depicted in Table 1.

Table 1. von Neumann correction scheme

Original Data    Resultant Data
01               0
10               1
11               delete
00               delete

As an example, a 1-D CA of lattice length $L = 31$ bits run for an evolution time of $T$ = 2,645,000 time steps under Rule 126 produced a density of ones equal to 0.527746. When the von Neumann reduction scheme described in Table 1 was applied to the same data, the density was corrected to 0.5. In addition, this density correction is usually accompanied by two important features as far as PRN generation is concerned.
The first is that the resultant data is now extremely hard to reproduce, a fundamental and necessary requirement for cryptographically strong PRNs. This is clearly due to the loss of information from both rules in the correction process. The process can therefore be considered irreversible. The second is a byproduct, namely an improvement in the statistical properties of the rule. For this particular example, the data was tested for statistical strength by the Diehard battery of tests and passed two tests, and one additional test was passed when the density was corrected. A more significant example is Rule 30, which under the same parameters passed 51 tests, whereas the number of passes jumped to 129 when the data was tested after the application of the von Neumann correction scheme. It is very clear from the time-space diagrams depicted in Figure 6 that Rule 126 and Rule 30 have undergone significant randomization, which is reflected in the time-space diagrams as well as in the number of passes for both rules, but it was more pronounced with Rule 30, as mentioned above.
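The mapping in Table 1 can be sketched in a few lines. The example below assumes that the bit string is read in non-overlapping pairs and that a trailing unpaired bit is dropped, which is the usual form of von Neumann's scheme but is not spelled out in the text; the function names `von_neumann_correct` and `density` are hypothetical.

```python
def von_neumann_correct(bits):
    """Apply the Table 1 mapping to non-overlapping pairs of bits.

    01 -> 0, 10 -> 1, 00 and 11 are deleted. A trailing unpaired bit is ignored.
    """
    out = []
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)  # pair 01 yields 0, pair 10 yields 1
    return out


def density(bits):
    """Fraction of ones in a non-empty bit sequence."""
    return sum(bits) / len(bits)


if __name__ == "__main__":
    raw = [0, 1, 1, 0, 1, 1, 0, 0, 1, 0]
    corrected = von_neumann_correct(raw)
    print(corrected, len(raw), len(corrected))
```

Every pair is either shortened to one bit or deleted, so the output is at most half the length of the input; this is the data compression and the loss of information referred to above.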
Figure 6. Space-Time diagrams for Rules 30 and 126

When the two rules in their uncompressed and compressed forms were mixed with an exclusive-OR operation, as depicted in Figure 7, some remarkable results were produced, as shown in Table 2. The three combinations R30 uncompressed with R126 uncompressed, R30 uncompressed with R126 compressed, R30 compressed with R126 uncompressed, produced identical results when tested with the Diehard test suite, and the density was also maintained at the favorable 0.5 level. The combination of R30 compressed with R126 uncompressed (Figure 8) produced the best results and passed all 229 Diehard tests, while maintaining the same ideal density of 0.5. It is generally accepted that passing all the Diehard tests is a strong indication that the PRN generator is suitable for cryptographic applications. This is in addition to the above stated hardness in reproducing the generated sequence. Further research and more details are deemed necessary in order to validate the initial findings in this paper. It can also be conjectured that the other chaotic rules can produce the same results.

Table 3 shows the variation in the Diehard test results for all the runs for Rules 30 and 126 as well as their mixtures. It can be seen that the Overlapping Sums test number 15 and the GCD test number 2 were the most difficult to pass, except for the PRN8 case (the combination of R30 compressed with R126 uncompressed).

Table 2. p-values and Density of Rules 30 and 126 mixtures
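The mixing step can be sketched as a bitwise exclusive-OR of two output streams. Figure 7 is not reproduced in this text, so the details here are assumptions: the two streams are combined position by position, and the longer stream is truncated to the length of the shorter one (a compressed stream is always shorter than its source). The name `xor_mix` is hypothetical.

```python
def xor_mix(stream_a, stream_b):
    """XOR two bit streams position by position.

    zip() stops at the end of the shorter stream, so any extra bits of the
    longer stream are discarded (an assumption, see the note above).
    """
    return [a ^ b for a, b in zip(stream_a, stream_b)]


if __name__ == "__main__":
    compressed_r30 = [1, 0, 1]              # placeholder bits, not real Rule 30 output
    uncompressed_r126 = [1, 1, 0, 1, 0, 0]  # placeholder bits, not real Rule 126 output
    print(xor_mix(compressed_r30, uncompressed_r126))
```

The bit values in the usage lines are placeholders chosen only to show the call; they are not data from the paper.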
Figure 7. Rules 30 and 126 mixing scheme
Table 3. Diehard Results for Rules 30 and 126
Figure 8. Space-Time diagrams for Rules 30 and 126

5. CONCLUSIONS
In this paper the Hamming Distance was revisited and applied to the 1-D CA. The original motivation was the classification of the rule space of the CA. This has been achieved with a very simple and yet effective approach. The results show a well defined behavior of the chaotic rules of category III as compared to the behavior of the rules of the other three categories. The oscillations of the Hamming distance in the transient stage are indicative of the chaotic nature of the rule. In other words, a high value of the Hamming distance in the transient stage is actually indicative of Category I or II rules. The Hamming distance values during the oscillation period do not vary as much as they do during the transient stage. It can be concluded that Category III rules are the best suited for PRN generation. The behavior of category I rules seems to be very clear: their time evolution reaches a Hamming distance equal to 0 after only one or two time steps, depending on the initial seed. Category II and IV rules seem to behave in a similar manner. They both reach a constant Hamming distance after a very short transient, with a slight difference in the values of the Hamming distance during the transient, but the asymptotic behavior is the same. Therefore, the new categorization of the rule space consists of three distinct types: Category I, Category II and IV combined, and Category III. This seems to agree with the findings of some past researchers who argued strongly against the separate categorization of Category IV. The finding in this paper can reduce the rule search significantly.
The correlation technique that is usually used in the analysis of pseudo random number generation can indicate the amount of correlation between two delayed versions of the data, as well as the distance between the cycles if the cycles exist. In this paper the Hamming distance is used as an alternative. The advantage of the Hamming Distance approach over the correlation approach is that the Hamming distance can show the transient stage (the number of time steps needed to finish the transient orbit, or system time constant) as well as the cycles with clear repetition, a feature that the correlation technique is unable to produce. In addition, the Hamming distance can arrive at the results in a very short time, while the cross correlation technique requires the full length of the data and much more computational effort.
It is also clear from the results in this paper and the findings of previous research that not all rules of Category III are suitable for PRN generation. One persistent problem with the rules that are deemed unsuitable is the non-uniform density of their output, as with Rule 126. The application of the von Neumann reduction scheme proved to be beneficial. The density was corrected to the desirable value of 0.5. A byproduct of this was the improvement in the randomization, as depicted in the images produced, which was also validated by the increase in test passes. A more significant improvement was in the number of tests passed by Rule 30, which jumped from 51 prior to the application of the reduction scheme to 129 after it. Another remarkable result was achieved when the two types of rules, R126 and R30, were mixed together. When a reduced output of R30 was exclusive-ORed with the unreduced output of R126, the output data passed all 229 Diehard tests, a result that is extremely difficult to achieve with other PRN sources. Further effort may be required to validate the findings in this paper as well as to show that the approach is equally applicable to the other chaotic rules.

REFERENCES
[1] G. Eason, B. Noble, and I. N. Sneddon, “On certain integrals of Lipschitz-Hankel type involving products of Bessel functions,” Phil. Trans. Roy. Soc. London, vol. A247, pp. 529–551, April 1955.
[2] J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed., vol. 2. Oxford: Clarendon, 1892, pp. 68–73.
[3] I. S. Jacobs and C. P. Bean, “Fine particles, thin films and exchange anisotropy,” in Magnetism, vol. III, G. T. Rado and H. Suhl, Eds. New York: Academic, 1963, pp. 271–350.
[4] K. Elissa, “Title of paper if known,” unpublished.
[5] R. Nicole, “Title of paper with only first word capitalized,” J. Name Stand. Abbrev., in press.
[6] Y. Yorozu, M. Hirano, K. Oka, and Y. Tagawa, “Electron spectroscopy studies on magneto-optical media and plastic substrate interface,” IEEE Transl. J. Magn. Japan, vol. 2, pp. 740–741, August 1987 [Digests 9th Annual Conf. Magnetics Japan, p. 301, 1982].
[7] M. Young, The Technical Writer's Handbook. Mill Valley, CA: University Science, 1989.