SWARM Intelligence and Big Data

Swarm technology may be thought of as insect logic.  Ants behave in the colony’s interest, but without specific guidance.  How can an animal with a brain the size of a grain of sand create a nest, find food, support a queen and expand to new areas?  Each ant is an individual point of simple intelligence, following rules that are shared by all ants, and everything the colony needs to get done gets done.  The simplicity extends to ant communication: using pheromones, ants communicate without direct contact by reading the chemicals left behind by other ants.  Ants are great at parallel operations because there is no single point of control for all actions.  Only one ant, the queen, is empowered to reproduce.  In this way the colony is controlled, and the needs of the colony are met.

Think of a Big Data problem.  The MapReduce architecture splits a job into many parallel tasks.  Simple key/value logic is applied in the map step, the results are shuffled by key, and a reduce step produces a smaller output that has been intelligently organized.  Each of the MapReduce nodes has just enough intelligence to perform key/value matching.  But in the world of swarm intelligence, this could be done by a multitude of agents, not just a few processors in a batch job.  What if you could release multiple agents into your database of unstructured data to look for anomalies, weed out spurious data or corrupted files, organize data by individual attributes, or identify alternate routes if there are networking problems?  Perhaps swarm intelligence could comb Facebook data to find the next mentally unstable serial killer before they strike.  The beauty is that these agents can each be programmed with different simple logic, which should keep the cost low and the performance high.  The simple logic might key on attributes such as filetype, age, data protection profile, source, encryption, etc.
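To make the idea of agents with "different simple logic" concrete, here is a minimal Python sketch.  It is only an illustration under stated assumptions: the Record fields, the Agent class, the release function and the three rules are all hypothetical names invented for this example, and the loop runs sequentially where a real swarm would run its agents in parallel.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Optional


# Hypothetical record layout; the fields echo attributes named in the text
# (filetype, age) plus a size used as a crude corruption hint.
@dataclass
class Record:
    name: str
    filetype: str
    age_days: int
    size_bytes: int


# An agent is nothing more than a label and one simple rule.
# The rule returns a tag for the record, or None if it has nothing to say.
@dataclass
class Agent:
    label: str
    rule: Callable[[Record], Optional[str]]

    def visit(self, record: Record) -> Optional[str]:
        return self.rule(record)


def release(agents, records):
    """Let every agent visit every record and group record names by tag."""
    findings = defaultdict(list)
    for record in records:
        for agent in agents:
            tag = agent.visit(record)
            if tag is not None:
                findings[tag].append(record.name)
    return dict(findings)


# Three agents, each with a different one-line rule (all assumptions).
agents = [
    Agent("by-type", lambda r: f"type:{r.filetype}"),
    Agent("stale", lambda r: "stale" if r.age_days > 365 else None),
    Agent("suspect", lambda r: "possibly-corrupt" if r.size_bytes == 0 else None),
]

records = [
    Record("q1.csv", "csv", 30, 2048),
    Record("old.log", "log", 900, 512),
    Record("empty.csv", "csv", 10, 0),
]

findings = release(agents, records)
for tag, names in sorted(findings.items()):
    print(tag, names)
```

Each rule stays trivial, and adding a new behavior means adding one more agent to the list.  The grouping of names by tag plays roughly the role that the shuffle step plays in MapReduce.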

One premise is that swarm logic makes file structures unnecessary.  All information about the data is included in the tokens (the pheromones).  This would allow the integration of structured and unstructured data.  In short, swarm can change everything.  I won’t go so far as to say all file structures can go away.  ACID compliance for data consistency is important for tasks that have one correct answer, such as “What was my profit last year?” or “How many widgets are in the warehouse?”  For jobs like this, structured data and relational databases might be best.  But for many analysis jobs, swarm may be a great improvement.

Swarm provides something that traditional data structures don’t: file intelligence.  In today’s structures we have limited space for a rigid, fixed set of metadata.  With swarm we can add more intelligence to the data by using tokens that combine intelligence and information, and let them loose on our unstructured data to find organization, or to sort or otherwise manipulate the data.
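One way to picture a token that combines information with intelligence is sketched below in Python.  Everything here is an assumption made for illustration: the Item and Token classes, the marks dictionary standing in for pheromones, and the two rules are hypothetical, not an existing product or API.  The point is that the marks are open-ended, and that a later token can act on what an earlier one left behind.

```python
from dataclasses import dataclass, field
from typing import Callable


# A piece of unstructured data plus open-ended marks ("pheromones").
# Unlike a fixed metadata header, any token may add any key.
@dataclass
class Item:
    content: bytes
    marks: dict = field(default_factory=dict)


# A token bundles information (info) with simple intelligence (rule).
@dataclass
class Token:
    info: dict
    rule: Callable[["Token", Item], None]

    def visit(self, item: Item) -> None:
        self.rule(self, item)


def mark_text(token: Token, item: Item) -> None:
    """Mark whether the content decodes cleanly as UTF-8 text."""
    try:
        item.content.decode("utf-8")
        item.marks["text"] = True
    except UnicodeDecodeError:
        item.marks["text"] = False


def mark_sensitive(token: Token, item: Item) -> None:
    """Read the mark left by an earlier token; only inspect text items."""
    if item.marks.get("text"):
        text = item.content.decode("utf-8").lower()
        item.marks["sensitive"] = any(k in text for k in token.info["keywords"])


items = [
    Item(b"Payroll summary for March"),
    Item(b"\xff\xfe\x00\x01"),  # not valid UTF-8
    Item(b"lunch menu"),
]

tokens = [
    Token({}, mark_text),
    Token({"keywords": ["payroll", "salary"]}, mark_sensitive),
]

# Release the tokens one after another; the second relies on marks
# deposited by the first, with no direct contact between them.
for token in tokens:
    for item in items:
        token.visit(item)

for item in items:
    print(item.marks)
```

The second token never talks to the first.  It only reads the marks the first one left on the data, which is the ant-and-pheromone pattern in miniature.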