Zipfian Science

Evolution, the great learning algorithm

by Justin Hocking, 01-01-2020

659 words (4 mins)
#zipfian #misc

The 21st century algorithm

With the development of a global network of information sharing, coupled with large-scale data collection and Moore's law still holding, powerful new mathematical tools and algorithms have spurred the digital age to start the final evolutionary phase of humankind (pause for dramatic effect). Generally referred to as Artificial Intelligence, these algorithms are opening the door to a world of automation, new scientific discovery, social change and a plethora of technologies that will ultimately push the world past an event horizon and toward the singularity. Our greatest challenge will be finding the master algorithm that will drive this momentum.

The five tribes

In his book The Master Algorithm, Pedro Domingos outlines five tribes of machine learning,

  • Symbolists
  • Connectionists
  • Evolutionaries
  • Bayesians
  • Analogizers

each of which has its own proposed learning algorithm, and each with a good argument as to why theirs is best.

The 20th century favoured the symbolists and their inductive reasoning. By the end of the century, Bayesians and analogizers upped their R & D and secured their respective places in the final race for the ultimate algorithm. Connectionists, although an old tribe themselves, only had their breakthrough at the start of this century with the advent of large data sets and ever-increasing computational power. That leaves the evolutionaries to prove their well-deserved admittance to the final race.


Evolutionary computation comprises several subfields of evolutionary algorithms (EA) that simulate the processes of evolution and natural selection, two of the most widely used and developed being Genetic Algorithms (GA) and Genetic Programming (GP). The largest hurdle evolutionaries have always faced is computational power. EA optimises by constructing large populations of individuals, evaluating a generation, reproducing with the fittest in the population to create a new generation, then repeating this process until some stopping criterion is met. This takes a lot of processing power and has thus been a contributing factor in the slow development of the field in recent decades. Increasing computational power and new advances in distributed computing are now driving renewed interest in the field, especially in uniting the connectionist and evolutionary tribes.
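The evaluate–select–reproduce loop described above can be sketched in a few lines of Python. This is a minimal, illustrative GA (not code from any Zipfian Science project) solving the classic "OneMax" toy problem: evolving a bit string towards all ones. The genome length, population size and mutation rate are arbitrary choices for the sketch.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

GENOME_LEN = 20
POP_SIZE = 30
GENERATIONS = 40
MUTATION_RATE = 0.02

def fitness(genome):
    # OneMax: fitness is simply the count of 1-bits in the genome.
    return sum(genome)

def crossover(a, b):
    # Single-point crossover: splice a prefix of one parent onto
    # the suffix of the other.
    point = random.randint(1, GENOME_LEN - 1)
    return a[:point] + b[point:]

def mutate(genome):
    # Flip each bit independently with a small probability.
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit
            for bit in genome]

def evolve():
    # Construct an initial random population.
    population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
                  for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        # Evaluate the generation and keep the fittest half as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[:POP_SIZE // 2]
        # Reproduce from random parent pairs to form the next generation.
        population = [mutate(crossover(random.choice(parents),
                                       random.choice(parents)))
                      for _ in range(POP_SIZE)]
    return max(population, key=fitness)

best = evolve()
```

Even this naive version converges quickly on a toy problem; the computational cost the article mentions comes from evaluating every individual in every generation, which is exactly the part that parallelises well on distributed hardware.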

Why Zipfian?

With the imminent revival of the field, Zipfian Science was started by a group of computer scientists and engineers who have been employing EA over the last few years to optimise machine learning solutions. While EA is a predominant focus, it is certainly not an exclusive one: we are super interested in all things computer science. That's why you'll find all kinds of software projects on this website, ranging from generative algorithms to YAML config tools. No fun Python project will be overlooked.

What we do


At the top of our list is the development of EA tools and technology, with the aim of open-sourcing everything we create to generate renewed interest amongst data scientists and researchers working in the field of machine learning. Beyond that, the focus is on writing fun and useful blogs, making our contribution as good internet citizens and sharing knowledge.

What is Zipfian?

Zipf's Law simply put:

Zipf’s law describes how the frequency of a word in natural language depends on its rank in the frequency table. The most frequent word occurs roughly twice as often as the second most frequent word, three times as often as the third most frequent word, and so on down to the least frequent word. The law is named after the American linguist George Kingsley Zipf, who was the first to try to explain it, around 1935. (taken from here)

When sorting a list of words by their frequency in a corpus, the 100 most frequent words already make up about 50% of the corpus. Language, though infinite, has a very predictable pattern that can easily be modelled. It's learning the lower tail of the distribution that becomes increasingly difficult, and where smarter algorithms are needed to capture deeper meaning.
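A small sketch makes the "top 100 words cover about half the corpus" claim concrete. Under an idealised Zipf distribution, the frequency of the word at rank r is proportional to 1/r; the vocabulary size of 10,000 below is an arbitrary choice for illustration, not a figure from the article.

```python
# Idealised Zipf distribution: weight of rank r is 1/r.
VOCAB_SIZE = 10_000

weights = [1 / rank for rank in range(1, VOCAB_SIZE + 1)]
total = sum(weights)  # harmonic number H(VOCAB_SIZE), the normaliser

def share_of_top(k):
    # Fraction of all word occurrences covered by the k highest ranks.
    return sum(weights[:k]) / total

print(f"top 100 words cover {share_of_top(100):.0%} of the corpus")
```

For a 10,000-word vocabulary this comes out at just over half, matching the rule of thumb in the text; the remaining mass is spread thinly over the long tail, which is exactly the part that is hard to learn.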

© Copyright 2020 Zipfian Science

