I was inspired (by a really, really great person) to contribute my master's thesis to the global heap of the web. If it helps even one person, it is more than worth it. I finished the work two years ago and put it away in a drawer, but maybe someone can use it to create something valuable. I regret only one thing: it was written in Polish, which may be an obstacle, but I’m happy to answer any questions if someone gets really interested.
The main idea of my master's thesis was to answer the question of how to help visually impaired people in environments that were not designed with them in mind. Since I have always been more practically than theoretically oriented, I decided to create an actual software prototype that can run on embedded devices (my platform of choice was the Blackfin signal processor from Analog Devices). The system consisted of a camera to gather live data, a signal processor to process it, and software to decide whether something important to a blind person is in front of them. Warnings or information could be sent to that person as voice messages.
I decided to explore two tracks. The first used tags designed for Augmented Reality, called ARTags (if you haven’t seen what Augmented Reality is, I truly encourage you to watch some demos on YouTube; the possibilities of this idea are just mind-boggling). The main goal was to put ARTags on or near important obstacles (doors, stairs, etc.) and make my software recognize them in real time. Since signal processors are designed with power efficiency in mind, my software used a library written specifically for small devices (PDAs, smartphones, etc.). It is called ARToolkitPlus, and I ported it to the Analog Devices environment. Source code is available on request.
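To illustrate the last step of this track, here is a minimal Python sketch of turning recognized tags into spoken warnings. It is not the code from the thesis and does not use the real library API: the marker IDs, the messages, the `messages_for` function and the confidence cutoff are all hypothetical, and the detector is assumed to deliver `(marker_id, confidence)` pairs for each camera frame.

```python
# Hypothetical marker IDs and messages; a real deployment would define its own.
MARKER_MESSAGES = {
    12: "Door ahead",
    27: "Stairs going down ahead",
}


def messages_for(detections, min_confidence=0.5):
    """Map detected tags to warnings.

    detections: iterable of (marker_id, confidence) pairs for one frame
    (an assumed detector output format, not a real library structure).
    """
    spoken = []
    for marker_id, confidence in detections:
        message = MARKER_MESSAGES.get(marker_id)
        if message is None:
            continue  # tag is not in our table, nothing to say
        if confidence < min_confidence:
            continue  # weak detection, likely a false positive
        if message not in spoken:
            spoken.append(message)  # say each warning once per frame
    return spoken
```

The returned strings would then be passed to whatever produces the voice output.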
My second idea hadn’t really been explored before (or at least that is what Google told me). I was thinking about how to describe an image in such a way that we can actually tell that a specific shape or obstacle is in front of us. My weapons of choice were the Visual Descriptors from the MPEG-7 standard. I checked two: Color Layout and Edge Histogram. The first turned out to be tightly tied to a specific location (different buildings, different colors) and sensitive to poor lighting, while the latter gave surprisingly good results. Edge Histogram extracts the distribution of edges in an image, which means it can describe the shapes typical of specific locations. Stairs turned out to be a very good subject, as their recognition rate was very high. They are also a serious threat to visually impaired people, so I think a universal warning system could find real-life use.
How did it work? First I took pictures of different staircases and “described” them using the Visual Descriptors. This gave a set of reference attributes against which we can compare live pictures gathered by the camera acting as the blind person's eyes. Every image captured by the signal processor is described using the same descriptors and then compared with the database of previously tagged shapes.
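To make the describe-and-compare idea concrete, here is a simplified Python sketch in the style of the MPEG-7 Edge Histogram descriptor. It is not the thesis implementation (which ran on the Blackfin): it splits a grayscale image into a 4x4 grid, classifies small blocks into five edge types, and yields 80 bins, but it leaves out the standard's bin quantization and the extra global histograms often used when matching. The function names, the block size and the threshold value are assumptions made for illustration, and a plain L1 distance stands in for the matching step.

```python
import math

# 2x2 edge filters applied to the mean brightness of a block's four
# quadrants (order: top-left, top-right, bottom-left, bottom-right).
FILTERS = [
    (1.0, -1.0, 1.0, -1.0),                   # vertical edge
    (1.0, 1.0, -1.0, -1.0),                   # horizontal edge
    (math.sqrt(2), 0.0, 0.0, -math.sqrt(2)),  # 45-degree diagonal
    (0.0, math.sqrt(2), -math.sqrt(2), 0.0),  # 135-degree diagonal
    (2.0, -2.0, -2.0, 2.0),                   # non-directional
]


def block_edge_type(img, top, left, size, threshold):
    """Return the index of the strongest filter for one block, or None."""
    half = size // 2
    means = []
    for dy in (0, half):
        for dx in (0, half):
            total = sum(img[top + dy + y][left + dx + x]
                        for y in range(half) for x in range(half))
            means.append(total / (half * half))
    strengths = [abs(sum(c * m for c, m in zip(f, means))) for f in FILTERS]
    best = max(range(len(FILTERS)), key=strengths.__getitem__)
    return best if strengths[best] >= threshold else None


def edge_histogram(img, grid=4, block=4, threshold=11.0):
    """Describe a grayscale image (list of rows) as grid*grid*5 bins."""
    cell_h, cell_w = len(img) // grid, len(img[0]) // grid
    hist = []
    for gy in range(grid):
        for gx in range(grid):
            counts = [0] * len(FILTERS)
            blocks = 0
            for top in range(gy * cell_h, (gy + 1) * cell_h - block + 1, block):
                for left in range(gx * cell_w, (gx + 1) * cell_w - block + 1, block):
                    blocks += 1
                    kind = block_edge_type(img, top, left, block, threshold)
                    if kind is not None:
                        counts[kind] += 1
            # Fraction of blocks in this grid cell showing each edge type.
            hist.extend(c / blocks if blocks else 0.0 for c in counts)
    return hist


def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(hist, database):
    """database: list of (label, histogram); return the closest label."""
    return min(database, key=lambda entry: l1_distance(hist, entry[1]))[0]
```

In use, the database would be built once from the tagged reference pictures, and each live frame would be passed through `edge_histogram` and then `classify`. A real system would also reject matches whose distance is too large instead of always returning the nearest label.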
As I mentioned at the beginning of this article, I would be very happy if this information were useful to somebody, instead of lying around and gathering dust on my hard drive. Maybe you should share your work with the world as well. Who knows what the combined knowledge of us all can create?
Presentation about my master's thesis
My master's thesis