Machine Learning Goes Big Brother

New surveillance products use machine learning to automatically detect dangerous or criminal events from video and sound feeds.

I was at the security exhibition IFSEC in Birmingham a few weeks ago as part of a project totally unrelated to machine learning. Or so I thought. It turned out that machine learning has found its way into some rather neat applications in the security business, in particular for automatically detecting dangerous or criminal events from surveillance equipment, thereby reducing the workload for the security guards who continuously monitor a large number of video feeds.

BRS Labs, based in Houston, Texas, has developed a video analytics technology that they call behavioral recognition. This essentially means that the system automatically learns common patterns in the movements of the objects passing through the scene, and whenever it detects a movement pattern that is not normal for that kind of object, it issues an alarm. Compared to traditional video analytics, which is based on rules that are set up for each camera, behavioral recognition reduces the number of false alarms and simplifies installation.
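To make the idea of "learning what is normal for each kind of object" concrete, here is a minimal sketch. It is not BRS Labs' method, which isn't described here; it assumes that each tracked object has already been reduced to a class label and a single speed value, and it flags speeds that deviate strongly from what has been seen for that class. The class names, the z-score threshold and the `min_samples` warm-up are all hypothetical choices for illustration.

```python
import math
from collections import defaultdict


class RunningStats:
    """Online mean and variance (Welford's algorithm) for one object class."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


class SpeedAnomalyDetector:
    """Flags speeds that are unusual for a given object class (assumed setup)."""

    def __init__(self, z_threshold=3.0, min_samples=20):
        self.z_threshold = z_threshold
        self.min_samples = min_samples
        self.stats = defaultdict(RunningStats)

    def observe(self, object_class, speed):
        """Return True if the speed looks abnormal; otherwise learn from it."""
        s = self.stats[object_class]
        anomalous = False
        # Only judge once enough normal behaviour has been seen for this class.
        if s.n >= self.min_samples and s.std() > 0:
            z = abs(speed - s.mean) / s.std()
            anomalous = z > self.z_threshold
        if not anomalous:
            s.update(speed)  # anomalies are not folded into the "normal" model
        return anomalous
```

Note that no per-camera rule is written anywhere: the notion of "normal" comes entirely from the observations, which is the contrast with rule-based analytics drawn above.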

Sound Intelligence, based in Amersfoort, Netherlands, has developed an audio analytics system that recognizes sounds that indicate criminality or dangerous situations, for example breaking glass, gunshots and aggressive voices. When such a sound is detected, it alerts a security guard, who can then focus his or her attention on the associated video feed. I looked a bit more into the technology, and found out that the system employs some rather interesting signal processing techniques that I hadn’t seen before. In particular, it filters out the background noise and detects the onset of a foreground sound event by continuously monitoring and modelling the background noise. The analysis is based on a graph of the sound known as a cochleogram (see example below), which mimics the information extracted from a sound by the human ear.

Cochleogram for the Dutch word "welkom".
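The background-tracking idea mentioned above can be illustrated with a small sketch. This is an assumption about the general technique, not Sound Intelligence's actual algorithm: it works on plain per-frame energies rather than a cochleogram, keeps an exponential moving average as the background model, and reports an onset when a frame's energy exceeds the background by a fixed ratio. The parameters `alpha`, `ratio` and `floor` are hypothetical.

```python
def detect_onsets(frame_energies, alpha=0.05, ratio=4.0, floor=1e-9):
    """Return the indices of frames where a foreground sound event starts.

    The background level is an exponential moving average of the energy,
    updated only on frames that are not part of a foreground event.
    """
    onsets = []
    background = None
    in_event = False
    for i, energy in enumerate(frame_energies):
        if background is None:
            background = energy  # the first frame seeds the background model
            continue
        is_foreground = energy > ratio * max(background, floor)
        if is_foreground and not in_event:
            onsets.append(i)  # transition from background to foreground
        in_event = is_foreground
        if not is_foreground:
            # Slowly adapt to the current noise level.
            background = (1 - alpha) * background + alpha * energy
    return onsets


# Steady noise with two loud bursts starting at frames 20 and 45.
energies = [1.0] * 20 + [10.0] * 5 + [1.0] * 20 + [12.0] * 3
print(detect_onsets(energies))
```

Because the background estimate keeps adapting, a slow rise in ambient noise raises the detection level instead of triggering an alarm.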

It was also interesting to notice that the system doesn't seem to employ any of the well-known, generic classification techniques found in the standard literature, such as support vector machines or hidden Markov models. Instead, once all the heavy signal processing is complete, it uses rather simple recognition rules that are specialized for the different sound classes that it’s designed to recognize. For example, to detect an aggressive voice, it merely extracts some sound parameters that are known to be characteristic of aggressive voices and checks whether these parameters are above a certain threshold. I think it displays a pragmatic approach to system design: if the system doesn’t need to be generic and flexible, one can often achieve better results by tailoring the algorithms to the specific needs rather than using generic techniques.
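A rule of this kind is only a few lines of code, which is part of its appeal. The sketch below is purely illustrative: the parameter names (`pitch_hz`, `loudness_db`, `roughness`) and the threshold values are invented for the example and are not taken from the actual product.

```python
# Hypothetical parameters and limits, chosen only for illustration.
AGGRESSION_THRESHOLDS = {
    "pitch_hz": 300.0,
    "loudness_db": 75.0,
    "roughness": 0.6,
}


def is_aggressive(params, thresholds=AGGRESSION_THRESHOLDS):
    """Return True only if every extracted parameter exceeds its threshold."""
    return all(params[name] > limit for name, limit in thresholds.items())


calm = {"pitch_hz": 180.0, "loudness_db": 60.0, "roughness": 0.2}
shouting = {"pitch_hz": 350.0, "loudness_db": 85.0, "roughness": 0.8}
print(is_aggressive(calm), is_aggressive(shouting))
```

All the difficulty sits in the signal processing that produces reliable parameters; the decision step itself needs no training data and is easy to inspect and tune.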

Can a Smartphone Recognize Birds?

The Alexandra Institute and DTU conducted a study on automatic recognition of bird sounds on smartphones. The conclusion was that no one has yet demonstrated the recognition accuracy that one could expect from a bird recognizer app.

Would it be possible to develop a smartphone app that recognizes birds from their sounds, so that you could bring your smartphone to the forest and have it tell you what birds you’re hearing? That’s the question that I was asked to sort out a couple of years ago, when the small Danish company pr-development hired us for an initial feasibility study. However, since I was rather new to machine learning at that time, I contacted Jan Larsen and Lasse Lohilahti Mølgaard at DTU (Technical University of Denmark) to help me out - they have extensive experience with machine learning on sound.

We started out with experiments on some sound data from six different bird species, all recorded with a smartphone. With some help from Lasse at DTU, I extracted 19 features for each 100 ms chunk of sound data - those were features like frequency, energy, some tonal components, etc. Below I have taken two out of these 19 features and plotted them in a two-dimensional chart, just to show how the different species dominate different regions of the feature chart - somewhat similar to how socialistic and liberal voters dominate different regions on a map of Denmark. Each cross corresponds to a 100 ms chunk, and the color indicates the species.
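As a rough illustration of per-chunk feature extraction, the sketch below splits a signal into 100 ms chunks and computes just two simple features for each: the mean energy and a crude frequency estimate based on zero crossings. These two are stand-ins chosen for simplicity and are not necessarily among the 19 features used in the study.

```python
import math


def chunk_features(samples, sample_rate, chunk_ms=100):
    """Return one (energy, frequency estimate in Hz) pair per full chunk."""
    chunk_len = int(sample_rate * chunk_ms / 1000)
    features = []
    for start in range(0, len(samples) - chunk_len + 1, chunk_len):
        chunk = samples[start:start + chunk_len]
        # Mean energy of the chunk.
        energy = sum(x * x for x in chunk) / chunk_len
        # A pure tone crosses zero twice per period, so half the crossing
        # rate gives a rough frequency estimate.
        crossings = sum(
            1 for a, b in zip(chunk, chunk[1:]) if (a < 0) != (b < 0)
        )
        freq_hz = crossings * sample_rate / (2 * chunk_len)
        features.append((energy, freq_hz))
    return features


# A 1 kHz test tone, 0.3 s long, sampled at 8 kHz: three chunks.
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / rate + 0.1) for n in range(2400)]
for energy, freq in chunk_features(tone, rate):
    print(round(energy, 3), round(freq))
```

Each chunk then becomes one point in feature space, which is what each cross in the plot represents.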


Feature plot for six bird species: Blue - Grasshopper Warbler; Green - Garden Warbler; Red - Blackcap; Cyan - Common Redstart; Violet - Common Blackbird; Black - Eurasian Pygmy Owl.

Even though this chart only shows two out of the 19 features (it’s difficult to produce a 19-dimensional chart), one can see that some bird species, notably Grasshopper Warbler (blue) and Eurasian Pygmy Owl (black), are clearly distinguishable, while others are more mixed up. Lasse then tried a number of different classifier models and ended up with an accuracy of 73% with the best model, which in this case turned out to be multinomial logistic regression.
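For readers unfamiliar with multinomial logistic regression (also called softmax regression), here is a minimal from-scratch sketch. It is not the code used in the study: it trains on made-up toy data with two features and three well-separated classes instead of 19 features and six species, and the learning rate and epoch count are arbitrary assumptions.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, x):
    xb = list(x) + [1.0]  # append a constant 1 for the bias term
    return [sum(w * v for w, v in zip(row, xb)) for row in W], xb


def train_softmax_regression(X, y, n_classes, lr=0.1, epochs=200):
    """Fit one weight row per class with plain stochastic gradient descent."""
    W = [[0.0] * (len(X[0]) + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, label in zip(X, y):
            scores, xb = class_scores(W, x)
            probs = softmax(scores)
            for k in range(n_classes):
                # Gradient of the cross-entropy loss for class k.
                err = probs[k] - (1.0 if k == label else 0.0)
                for j in range(len(xb)):
                    W[k][j] -= lr * err * xb[j]
    return W


def predict(W, x):
    scores, _ = class_scores(W, x)
    return max(range(len(W)), key=lambda k: scores[k])


def make_toy_data(seed=0, per_class=30):
    """Three Gaussian clusters standing in for three 'species' (toy data)."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
    data = []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(per_class):
            data.append(((rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)), label))
    rng.shuffle(data)
    return [x for x, _ in data], [label for _, label in data]


X, y = make_toy_data()
W = train_softmax_regression(X, y, n_classes=3)
accuracy = sum(predict(W, x) == t for x, t in zip(X, y)) / len(y)
print("training accuracy:", accuracy)
```

The model draws linear boundaries between the classes, so it does well when species occupy separate regions of feature space and struggles when they overlap, as several do in the plot. Real accuracy figures should of course be measured on held-out recordings, not on the training data as in this toy example.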

But 73% on a catalog of six birds is obviously not good enough for the envisioned bird recognition app, so I went on to study some research articles to see if others had achieved better results. Indeed, Jancovic and Köküer [1] achieved 91.5% accuracy on 95 birds, and Chou et al. [2] achieved 78% on 420 birds. However, without going into details, none of the articles studied recognition under conditions that would apply to a smartphone app, so these results don’t really carry over to our case.

So the only thing we could conclude for sure is that if it’s at all possible to make a well-functioning bird recognizer app, then no one has yet proven it to be feasible. Indeed, if you look at the FAQ section on the home page of iBird, one of the most popular apps for bird identification, they also seem to have investigated the matter and concluded that it’s too difficult - at least for now. Thus, the question in the title remains open, and unfortunately we didn’t have the resources to conduct a full-scale study to settle it. If anyone can contribute some information or ideas, don’t hesitate to leave a comment!

[1] Peter Jancovic, Münevver Köküer: Automatic Detection and Recognition of Tonal Bird Sounds in Noisy Environments. EURASIP Journal on Advances in Signal Processing, 2011.
[2] Chih-Hsun Chou, Chang-Hsing Lee, Hui-Wen Ni: Bird Species Recognition by Comparing the HMMs of the Syllables. Second International Conference on Innovative Computing, Information and Control, 2007.
