Machine Learning Goes Big Brother

New surveillance products use machine learning to automatically detect dangerous or criminal events from video and sound feeds.

I was at the security exhibition IFSEC in Birmingham a few weeks ago as part of a project totally unrelated to machine learning. Or so I thought. It turned out that machine learning has found its way into some rather neat applications in the security business, in particular for automatically detecting dangerous or criminal events from surveillance equipment, thereby reducing the workload for the security guards who continuously monitor a large number of video feeds.

BRS Labs, based in Houston, Texas, has developed a video analytics technology that they call behavioral recognition. This essentially means that the system automatically learns common patterns in the movements of objects passing through the scene, and whenever it detects a movement pattern that is not normal for that kind of object, it raises an alarm. Compared to traditional video analytics, which relies on rules that are set up manually for each camera, behavioral recognition reduces the number of false alarms and simplifies installation.
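To make the idea of "learning what is normal per kind of object" concrete, here is a minimal sketch. It is not BRS Labs' method, which I don't know in any detail. It simply assumes that each tracked object has a kind (such as "person") and a measured speed, and it flags a speed that lies far from what has been seen before for that kind. The class name, the use of speed as the only feature and the z-score threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, stdev


class MovementModel:
    """Toy model of 'normal' speed per object kind (illustrative only)."""

    def __init__(self, z_threshold=3.0):
        # z_threshold is a hypothetical tuning knob: how many standard
        # deviations from the mean still counts as normal.
        self.z_threshold = z_threshold
        self.speeds = defaultdict(list)  # object kind -> observed speeds

    def observe(self, kind, speed):
        """Record one observation of normal behaviour."""
        self.speeds[kind].append(speed)

    def is_anomalous(self, kind, speed):
        """Return True if the speed is unusual for this kind of object."""
        history = self.speeds[kind]
        if len(history) < 2:
            return False  # too little data to judge, so stay quiet
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            return speed != mu
        return abs(speed - mu) / sigma > self.z_threshold


model = MovementModel()
for s in [1.2, 1.4, 1.3, 1.5, 1.1, 1.3]:  # walking speeds in m/s
    model.observe("person", s)

print(model.is_anomalous("person", 1.4))  # ordinary walking pace
print(model.is_anomalous("person", 6.0))  # somebody sprinting
```

The point of the sketch is that nobody has to write a rule such as "alert if a person moves faster than 4 m/s" for each camera; the notion of normal comes from the observations themselves.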

Sound Intelligence, based in Amersfoort, Netherlands, has developed an audio analytics system that recognizes sounds that indicate crime or dangerous situations, for example breaking glass, gunshots and aggressive voices. When such a sound is detected, it alerts a security guard, who can then focus his or her attention on the associated video feed. I looked a bit more into the technology and found that the system employs some rather interesting signal processing techniques that I hadn’t seen before. In particular, it continuously monitors and models the background noise, which lets it filter that noise out and detect the onset of a foreground sound event. The analysis is based on a graph of the sound known as a cochleogram (see example below), which mimics the information extracted from a sound by the human ear.


Cochleogram for the Dutch word "welkom".
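The background-tracking idea can be illustrated with a much simpler stand-in than a cochleogram. The sketch below is my own simplification, not Sound Intelligence's algorithm: it assumes the audio has already been reduced to one energy value per frame, keeps a running estimate of the background level, and reports an onset whenever a frame is much louder than that estimate. The function name and the parameters alpha and ratio are hypothetical.

```python
def detect_onsets(energies, alpha=0.05, ratio=4.0):
    """Return the frame indices where a foreground sound event starts.

    energies: per-frame energy values (assumed input, e.g. mean squared amplitude)
    alpha:    smoothing factor of the background estimate (assumed value)
    ratio:    how many times louder than the background a frame must be
              to count as foreground (assumed value)
    """
    onsets = []
    background = None
    in_event = False
    for i, energy in enumerate(energies):
        if background is None:
            background = energy  # initialise the estimate from the first frame
            continue
        is_foreground = energy > ratio * background
        if is_foreground and not in_event:
            onsets.append(i)  # first loud frame after a quiet stretch
        in_event = is_foreground
        if not is_foreground:
            # Adapt the background only during quiet frames, so that a loud
            # event does not get absorbed into the noise estimate.
            background = (1 - alpha) * background + alpha * energy
    return onsets


# Steady noise with two loud bursts, starting at frames 20 and 45.
frames = [1.0] * 20 + [10.0] * 5 + [1.0] * 20 + [12.0] * 3
print(detect_onsets(frames))
```

Because the background estimate adapts slowly, a gradual rise in ambient noise (traffic picking up, say) moves the baseline rather than triggering an alarm, while a sudden burst still stands out.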

It was also interesting to notice that the system doesn't seem to employ any of the well-known, generic classification techniques found in the standard literature, such as support vector machines or hidden Markov models. Instead, once all the heavy signal processing is complete, it uses rather simple recognition rules that are specialized for the different sound classes that it’s designed to recognize. For example, to detect an aggressive voice, it merely extracts some sound parameters that are known to be characteristic of aggressive voices and checks whether these parameters are above a certain threshold. I think it displays a pragmatic approach to system design: if the system doesn’t need to be generic and flexible, one can often achieve better results by tailoring the algorithms to the specific needs rather than using generic techniques.
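A rule of this kind is almost trivially small in code, which is rather the point. The sketch below assumes the signal processing stage hands over a dictionary of named parameters; the parameter names (pitch_hz, loudness_db) and the threshold values are invented for illustration and are not taken from the actual product.

```python
# Hypothetical thresholds; real values would have to be tuned on recordings.
AGGRESSION_THRESHOLDS = {
    "pitch_hz": 300.0,
    "loudness_db": 80.0,
}


def is_aggressive(features, thresholds=AGGRESSION_THRESHOLDS):
    """Return True when every listed parameter exceeds its threshold.

    A parameter missing from `features` counts as not exceeding it.
    """
    return all(
        features.get(name, float("-inf")) > limit
        for name, limit in thresholds.items()
    )


print(is_aggressive({"pitch_hz": 350.0, "loudness_db": 85.0}))  # loud and high-pitched
print(is_aggressive({"pitch_hz": 350.0, "loudness_db": 60.0}))  # high-pitched but quiet
```

All the difficulty sits in producing reliable parameters in the first place; once they exist, a handful of comparisons per sound class does the job.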
