Haystacks

In the real world, objects have different colors depending on the time of day, time of year, and any number of other environmental effects. Claude Monet spent months painting the same haystacks to illustrate and understand the variability of color and shading as seen from a single vantage point.
Certain applications of machine vision attempt to ‘correct’ for variations in color that are artifacts of the light-gathering process rather than actual differences in the target being captured. On a factory floor, this is probably harmless. Another approach is to use such variations to extract additional information.
It’s my opinion (without much, if any, empirical evidence) that the z-axis of a frame can be discerned not only by stereoscopic imaging, but also by simply watching the evolution of a scene from a fixed vantage point over time. The way that shadows move across the scene as the sun shifts, and the way that objects move through the frame (pedestrians, cars, birds, etc.), help quantify depth. Furthermore, as light is altered by angle and atmospheric effects, certain characteristics of a surface become evident - whether it is painted, wooden, concrete, glass, etc. These characteristics would be assigned - one would conclude that the reflective response of a material could only mean it was vinyl, or a chromed bumper, or a particular kind of living or dead plant matter. These, in turn, could impart further information on the displacements of objects in the frame. The way leaves settle on lawns and walkways would hint, for example, that these were horizontal surfaces, and when they pile up against a fence, that a vertical surface intersected a horizontal one.
The appearance of certain objects or conditions in a frame generates implicit commands - such as a need to pull weeds, mow a lawn, or gather up leaves. Again, these would be assigned to regions of a frame - those in one’s yard, for instance, might have a different meaning from those in one’s neighbor’s.
One approach to gathering a ‘calibration data set’ is to take a picture every six minutes for a year - 87,600 pictures. This would create a database of the object collection over a full solar cycle, which should make it possible to identify not only fixed objects, but also the time of day and time of year they were observed. Interestingly enough, this would also, in certain respects, expose their location, or at least their latitude. Depending on the sensitivity of the capture during nighttime conditions, lunar and planetary inclinations would help identify longitude.
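The picture count is simple arithmetic, and it is easy to check. A minimal sketch, assuming a 365-day year; the function name `frames_per_year` is only illustrative:

```python
from datetime import timedelta


def frames_per_year(interval_minutes: int = 6, days: int = 365) -> int:
    """Count the captures taken in `days` days at one frame per interval."""
    # Dividing one timedelta by another with // yields a plain integer.
    return timedelta(days=days) // timedelta(minutes=interval_minutes)


if __name__ == "__main__":
    # 10 frames per hour, 24 hours a day, 365 days.
    print(frames_per_year())
```

Changing `interval_minutes` shows how quickly the collection grows or shrinks: a picture every minute would be six times as many frames.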
With such a data set, vision-driven systems, whether robots, self-driving cars, or stationary alarms, would have a far higher probability of accurate feature extraction. Clearly this is a lot more work and a lot more storage, but it creates a reliable range of likely values rather than trying to coerce a ‘correct’ one. 32GB flash memory modules make such data sets practical, leaving a budget of a few hundred kilobytes per picture - this might not have been worth attempting when such storage capacity was measured in megabytes.
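To make the ‘range of likely values’ idea concrete, here is one way it might look in code. This is only a sketch under my own assumptions, not a tested vision pipeline: frames are small grayscale grids stored as nested lists, the range is a plain per-pixel minimum and maximum, and the names `build_envelope` and `out_of_range` are made up for illustration.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grayscale image: rows of 0-255 intensities


def build_envelope(frames: List[Frame]) -> Tuple[Frame, Frame]:
    """Return the per-pixel minimum and maximum over all calibration frames."""
    if not frames:
        raise ValueError("need at least one calibration frame")
    lo = [row[:] for row in frames[0]]
    hi = [row[:] for row in frames[0]]
    for frame in frames[1:]:
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                lo[y][x] = min(lo[y][x], value)
                hi[y][x] = max(hi[y][x], value)
    return lo, hi


def out_of_range(frame: Frame, lo: Frame, hi: Frame,
                 margin: int = 0) -> List[Tuple[int, int]]:
    """List (x, y) of pixels outside the observed range, widened by `margin`."""
    return [(x, y)
            for y, row in enumerate(frame)
            for x, value in enumerate(row)
            if value < lo[y][x] - margin or value > hi[y][x] + margin]


if __name__ == "__main__":
    # Two tiny 2x2 calibration frames, then a new frame to check.
    calibration = [[[10, 200], [50, 50]],
                   [[30, 180], [55, 45]]]
    lo, hi = build_envelope(calibration)
    print(out_of_range([[20, 190], [90, 48]], lo, hi))
```

Nothing here tries to normalize a pixel to a single ‘correct’ value; a new frame is simply compared against everything that has been seen at that spot before. A real system would presumably key the range to time of day and season rather than pooling the whole year.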
- mnpoor's blog

