Monday, October 18, 2010

Chapter 16 - Video Surveillance; More Video Than You Ever Wanted To Know

This isn’t a how-to article on video surveillance; even my neighbor, the non-technical guy, installed his own system. Most of us understand how an IP-based video surveillance network works. What we want to cover is why all this phenomenal bandwidth we are creating takes video surveillance to another level, and why that may or may not be a good thing.
Video surveillance cameras used to be described in terms like CIF (352x288 pixel resolution) and 4CIF (704x576). Computers used resolutions like VGA (640x480) and XGA (1024x768). The common denominator in all of these is the 4:3 screen ratio. Movie makers marched to their own drum with 16:9 ratios, and today the standard has settled at the 1080 level (1920x1080).
With the integration of computers and movies, it became obvious that compression needed to be applied due to the limits of CD-ROMs and bandwidth. Various JPEG and MPEG compression schemes were developed until MPEG-2 became the most universally used method for DVD. However, bandwidth and storage limitations, along with increased processor power, drove compression through MPEG-4 (still one of the most popular) and others to the current standard of H.264.
So how does this relate to wireless? In the past, and all around the country, cameras over wireless were either very low resolution (CIF), highly compressed (blocky or blurry), or ran at a low frame rate (5-10 frames per second, or fps). A lot of systems with remote locations, like SCADA sites, didn’t even try to move video over the wireless system. Instead, they put an analog recorder with digital output on site and, due to bandwidth limitations, only monitored 1 or 2 of the cameras at a time remotely, regardless of how many were on site. Keeping in mind all the bandwidth we have proven we can deliver over wireless systems in the past few articles, the question becomes: what can we now do in the real world? Three years ago, we deployed a video analytic system with 48 cameras across 100 square miles in North Las Vegas and Boulder City, Nevada, using a Puretech PureActiv system and a SkyPilot 4.9GHz mesh system, so it’s not a new concept. Those cameras deliver CIF resolution at about 12fps due to storage limitations. Since multi-megapixel cameras are the next hottest thing, and since I’m involved in one of these projects right now, I can tell you where we are going next.
Start with the idea that multi-megapixel IP cameras are now on the market and cost-effective. For example, a 1080i outdoor camera from Axis like the 3334 or 1755 costs around $1,500 or less. There are many other products out there that are even less expensive, but I would test them to make sure they can deliver the frame rates you expect under similar conditions. We found that a couple of the less expensive cameras could deliver no more than 12fps even though they were rated at 30fps in that particular resolution mode.
The biggest issue is: how do you use all that video quality? For live displays, we are probably going to have to limit live viewing to CIF resolutions to get 16 cameras on a single display. With 50 cameras, one way to set it up might be four displays of reduced-size images plus one display for a full-size image, which largely negates having HD video. You can use whatever variation on this you want, even a whole wall of monitors. But beyond a handful of cameras, there will be a point where there are too many screens for anyone to watch simultaneously, or the real-time images are too small to have value. In reality, there is no cost-effective way to display and watch 50 high-resolution cameras. So where is the value?
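The display-wall arithmetic above can be sketched quickly. This is just back-of-the-envelope math, not tied to any particular video management software; the 16-up grid per monitor matches the setup described here.

```python
import math

def monitors_needed(num_cameras, grid_per_monitor=16, full_size_monitors=1):
    """Estimate the monitors a video wall needs when live views are
    tiled in a grid (e.g. 4x4 CIF thumbnails per display), plus
    dedicated monitors showing one full-size feed each."""
    thumbnail_monitors = math.ceil(num_cameras / grid_per_monitor)
    return thumbnail_monitors + full_size_monitors

# 50 cameras tiled 16-up need 4 thumbnail displays, plus 1 full-size view.
print(monitors_needed(50))  # 5
```

Scale the camera count up and the problem is obvious: the monitor count grows linearly, but a human operator's attention does not.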
In addition to broadcasting a 1920x1080 video stream or higher, the newer cameras can also capture stills at up to 5 megapixels. That makes for some fairly impressive images and opens up all sorts of possibilities if you can get them back to a central location for processing. That’s where our big wireless pipes start having value. Imagine the camera shooting a snapshot every 20 seconds to augment the high-quality video stream for forensic evidence at trial, and dumping these images on a central server.
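The extra load from those periodic snapshots is modest. A minimal sketch, assuming a 5-megapixel JPEG compresses to roughly 1-2 MB (the actual size depends on scene complexity and quality setting, which the article doesn't specify):

```python
def snapshot_bandwidth_mbps(jpeg_size_mb, interval_s):
    """Average uplink consumed by shipping one compressed still
    image every interval_s seconds, in megabits per second."""
    return jpeg_size_mb * 8 / interval_s  # MB -> megabits, spread over the interval

# One ~1.5 MB snapshot every 20 seconds adds only ~0.6 Mbps per camera.
print(round(snapshot_bandwidth_mbps(1.5, 20), 2))  # 0.6
```

Against a 7-8 Mbps H.264 stream, the snapshots are less than a tenth of the per-camera load, so the forensic stills come nearly free once the big pipe exists.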
Currently, most people use this much resolution for forensic purposes. An accident is usually going to look the same on video in HD as in CIF; in fact, a higher frame rate has more value than the resolution. However, the higher image quality might tell us who was driving, if that is in question, or reveal a detail the lower resolution cannot, such as a braking point based on a car nosing down. In reality, most megapixel cameras can deliver both high frame rates and HD quality.
I’m finishing our first such deployment right now, where all the fixed cameras are HD quality and the PTZ cameras are 4CIF or better (HD PTZ cameras weren’t available from Axis when we started the project). With the cameras set to 1920x1080, 20fps, 30% compression, using H.264, we are seeing about 7-8Mbps per camera.
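That observed bitrate translates directly into storage, which was the limiting factor on the earlier CIF deployment. A quick sketch of what continuous recording at the rates above costs on disk:

```python
def storage_gb_per_day(bitrate_mbps, hours=24):
    """Disk space one camera consumes recording continuously at a
    constant bitrate. Uses decimal GB (1 GB = 1000 MB), as storage
    vendors do."""
    megabits = bitrate_mbps * 3600 * hours
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes

# At the observed 7.5 Mbps, one 1080p H.264 camera writes ~81 GB per day.
print(round(storage_gb_per_day(7.5)))  # 81
```

Multiply by camera count and retention period and it is clear why the older system was forced down to CIF at 12fps: thirteen cameras at this rate write roughly a terabyte a day.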
There are two areas where the higher resolution system has much more of an advantage. The first is in the use of forensic evidence at trial. If the subject actually has features that are discernible, then there is a higher chance of prosecution. With CIF cameras, that means either very short ranges or very small viewing areas.
The second and more important use is in the field of video analytics. Video analytics uses a computer to analyze a video stream and look for specific types of activity. It basically turns video surveillance from a forensic device into a pro-active tool. Video analytics have been used in airports and depots to look for loiterers or abandoned luggage. More expensive analytic systems have more features, such as license plate recognition and facial recognition. Some video analytic systems can even gauge the emotional state of a subject or look for aberrant behavior.
The limitations on analytics have always been resolution, processing power, and algorithms. Lower resolutions can’t make out enough detail for facial recognition or license plates at any distance, and higher bandwidth over wireless (remember, this is a wireless series, not a wired series) has always been a challenge. At the same time, as the resolution increases, the processing power needs increase. For example, it takes 4 times as much processing power to handle a 4CIF video stream as it does a CIF video stream. Scale that up to 1080 HD resolution, and an older dual-Xeon server that could handle 8 CIF streams 3 years ago can’t even handle one HD stream.
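The scaling claim above follows from pixel counts. As a simplification, assume analytics load grows linearly with pixels per frame (real engines also depend on frame rate and algorithm choice):

```python
# Pixels per frame for the resolutions discussed in this series.
resolutions = {
    "CIF":    352 * 288,     # 101,376 pixels
    "4CIF":   704 * 576,     # exactly 4x CIF
    "1080HD": 1920 * 1080,   # 2,073,600 pixels
}

cif_pixels = resolutions["CIF"]
for name, pixels in resolutions.items():
    print(f"{name}: {pixels / cif_pixels:.1f}x the pixels of CIF")
```

One 1080 HD stream carries about 20x the pixels of a CIF stream, which is why a server sized for 8 CIF streams falls over on a single HD stream.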
Fortunately, between Intel and the gaming industry, the answer is right before us. Newer Intel processors built on the Core i7 architecture have some pretty massive power. Jump to the Xeon version of that processor series and it’s running 6 physical cores plus 6 virtual (hyper-threaded) cores. Double up the Xeon processors and you have more than sufficient horsepower to do any type of high-level video analytics.
Since video analytic processing isn’t any different from game processing in terms of the type of hardware needed, the gaming industry has pretty much handed us the answer. High-powered video cards, or GPUs (Graphics Processing Units), can be stacked to multiply the processing power. In fact, it’s possible to put 4 GPUs in a single computer capable of cracking weak AES encryption in minutes or hours. Maximum PC built a 3-card version of this exact computer. Obviously you want a different hard drive and storage combination, but if the software supports the GPUs, here’s the answer.
Improved analytic engines can also do object recognition. Imagine an Amber Alert where every camera in the city scans in real time for a specific make, model, and color of vehicle, in addition to license plates, to try to find a child. All of this advanced capability requires three things: lots of CIF cameras at very short distances for clarity; fewer cameras at very high definition; and lots of bandwidth to get this data back to a central location. If it’s wireless, that has historically been even more difficult.
The traffic surveillance system design we used in the Town of Sahuarita was based on three things:
1) Budget
2) Capability, currently and in the future
3) System expansion
7-8 Mbps per camera meant there needed to be a lot of capacity. Originally, the design involved 4 APs with sector antennas covering 360 degrees and up to 400Mbps or more (I told you we would get back to the wireless part of the equation eventually). Although the capacity was sufficient when originally installed, the RF environment changed while we were finishing the system. I covered the interference issues with the local WISP in an earlier article, and after my experience in Atlanta, I decided to change this design over as well. With an equipment change of less than $2,000, we expanded the capacity to 800Mbps and simultaneously reduced the noise floor from -75 to -92dBm or better. Most lights are now PTP links, either to City Hall or between each other. Since the highly directional antennas on the main building give beam patterns of 6 degrees or less, frequency reuse isn’t an issue. I haven’t used the building itself as an antenna isolation shield yet, but that’s coming next as we add more traffic lights.
Uneven terrain also meant AP hopping wasn’t an option. Since budget was an issue and we already had some of the infrastructure in place, we stayed with the Ubiquiti equipment. Technically, this is now a combination PTP/PTMP design. I didn’t use WDS since I needed security features that won’t work with WDS on the Ubiquiti products. And because the Rockets and Nanostations cost less than $100, the highest-cost node would be a pole with a Rocket M5 and an MTI dual-polarity 5.8GHz flat-panel antenna for about $350. However, as the deployment went in, we made some changes and are now using PowerBridges in place of the Rocket/MTI antenna combinations as they have become available. The end result is that every light has an MCS15 2x2 MIMO link, either directly back to City Hall or in a hop path between lights using the Rockets, Nanostations, and Nanostation Locos. The total cost of all the radio and antenna equipment for 13 traffic lights and 800Mbps of total capacity at City Hall will be less than $10,000, including the 2.4GHz WiFi system that went in simultaneously.
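It's worth sanity-checking that design against the per-camera bitrates measured earlier. A rough worst-case sketch; the per-intersection camera count here is an assumption, since the article doesn't state it:

```python
def aggregate_load_mbps(lights, cams_per_light, mbps_per_cam):
    """Worst-case backhaul load if every camera at every light
    streams at full rate simultaneously."""
    return lights * cams_per_light * mbps_per_cam

# Assume 2 cameras per light (one fixed HD, one PTZ) at the observed 7.5 Mbps.
load = aggregate_load_mbps(13, 2, 7.5)
print(load, "Mbps against 800 Mbps of capacity")  # 195.0 Mbps
```

Even under that pessimistic assumption, the 13-light system uses only about a quarter of the 800Mbps of capacity at City Hall, which is the headroom that makes the expansion plans below realistic.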
Although the system is still being installed, it will provide some excellent prosecutorial evidence when needed. In the case of accidents, the resolution of the fixed cameras, combined with the PTZ cameras paired with them, will give traffic and public safety the information they need to respond appropriately. In the case of a hit-and-run, the runner is going to have a much harder time getting away when there are high-resolution images of the vehicle and, when available, the plate. If the driver leaves the vehicle, the planned video analytic software with virtual PTZ tracking will keep the driver, now the runner, in camera view much longer for police and provide a better picture for recognition.
One other side note: many of these cameras have audio capability. We already apply analytics to gunshot detection and window-breakage applications. Throw in audio cues from a crash to support a video analytic rule of two objects trying to occupy the same area at the same time (a crash), and false alerts drop.
There is no real growth limit to the system. On the bandwidth side, each traffic light has the capacity to hop several lights if necessary or to add additional cameras. On the image side, as computer processing power continues to increase, the resolution and bandwidth are already in place to take advantage of it. That means more sophisticated surveillance tools for traffic and law enforcement, plus wireless bandwidth for mobile vehicles. Video analytics are the best way to use the increased resolution and image quality that increased bandwidth capacity can provide.
