Okay Declan, let’s try making this post a short and sweet update, not a rambling Homeric epic about simple stuff.
I got a Raspberry Pi (RPi) and an RPi camera because I wanted to learn about them and mess around with them. If I could do image recognition with them, that’d be a good platform to do ML, NN, and if I got enough data, maybe even DS type stuff. Luckily, there’s a ton of resources and code out there already. I drew heavily on www.pyimagesearch.com, which is a REALLY useful site that explains things very well for beginners. Two articles that I basically copied code from and then butchered were this and this.
He’s not quite doing “image recognition” in this code, it’s more like “difference recognition”. Very simply, he has a stream of frames coming in from the camera. He starts off by taking what will be considered a “background frame”. Then, for all subsequent frames, he subtracts the background from the current frame and looks at the absolute difference of the pixels (all done in grayscale, to make it simpler). If two frames were identical, you’d expect very little difference. If an object appeared in the new frame, the difference would show that object. Then, he uses some OpenCV tools to figure out where the object is, and draw a box around it.
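To make the idea concrete, here is a minimal sketch of that "difference recognition" step. It is not his code: it uses plain NumPy on synthetic frames instead of the camera and OpenCV, the function name `detect_change` and the threshold of 25 are made up for illustration, and it draws a single box around all changed pixels rather than one box per object.

```python
import numpy as np

def detect_change(background, frame, threshold=25):
    """Return one bounding box (x, y, w, h) around changed pixels, or None.

    Hypothetical helper for illustration; the threshold is an arbitrary choice.
    """
    # Cast to a signed type so the subtraction cannot wrap around in uint8.
    delta = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    mask = delta > threshold
    if not mask.any():
        return None  # frames are (nearly) identical
    ys, xs = np.nonzero(mask)
    x, y = xs.min(), ys.min()
    w, h = xs.max() - x + 1, ys.max() - y + 1
    return int(x), int(y), int(w), int(h)

# Synthetic grayscale frames: a flat background and a bright "object".
background = np.full((120, 160), 50, dtype=np.uint8)
frame = background.copy()
frame[40:60, 70:100] = 200  # object covers rows 40-59, columns 70-99

print(detect_change(background, background))  # None
print(detect_change(background, frame))       # (70, 40, 30, 20)
```

A real pipeline would find separate blobs (for example with OpenCV contours) so that two cars get two boxes; this sketch only shows where the "difference" comes from.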
I was able to put his code together and run it pretty quickly (though I removed some stuff like uploading to Dropbox, instead doing a kind of naive thing of sending the files via scp to my other machine), producing this gif of local traffic outside my window:
Of course, the devil is in the details. If you watch it a few times, you’ll notice some weird behavior. Most obviously, boxes are detected around the objects, but then the boxes appear to remain where the object was for several frames. Here you can see it frame by frame:
Why does this happen? Well, it’s actually a smart feature, but done in a somewhat clumsy way. In his code, he has the following (I combined the few relevant snippets) inside the main frame capturing loop:
if avg is None:
    print("[INFO] starting background model...")
    avg = gray.copy().astype("float")
    rawCapture.truncate(0)
    continue

cv2.accumulateWeighted(gray, avg, alpha)
frameDelta = cv2.absdiff(gray, cv2.convertScaleAbs(avg))
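To get a feel for what that running average does, here is a rough sketch that follows a single pixel through the same update order as the snippet above. `cv2.accumulateWeighted` updates the average as `avg = (1 - alpha) * avg + alpha * gray`. The pixel values, `alpha = 0.5`, and the threshold of 5 are assumptions picked for illustration, and the sketch skips the rounding to 8-bit that `cv2.convertScaleAbs` does.

```python
def simulate_pixel(values, alpha):
    """Return |gray - avg| for each frame after the first, for one pixel.

    Hypothetical helper mirroring the loop above: the first frame seeds
    the background, and later frames update the average before the diff.
    """
    avg = None
    deltas = []
    for gray in values:
        if avg is None:
            avg = float(gray)  # first frame becomes the background model
            continue
        avg = (1 - alpha) * avg + alpha * gray  # what accumulateWeighted computes
        deltas.append(abs(gray - avg))
    return deltas

# Empty road (50), a car covers the pixel for three frames (200), road again.
values = [50, 50, 200, 200, 200, 50, 50, 50, 50]
deltas = simulate_pixel(values, alpha=0.5)
print(deltas)
# [0.0, 75.0, 37.5, 18.75, 65.625, 32.8125, 16.40625, 8.203125]

threshold = 5  # assumed value, for illustration only
frames_still_flagged = sum(d > threshold for d in deltas[4:])
print(frames_still_flagged)  # 4
```

In this toy run the car gets partly absorbed into the background, so once it leaves, the empty road itself looks "different" for several frames. That seems consistent with the boxes lingering where the object used to be.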