Better than OCR

I was starting to think about how I could determine the location of each image that I took so that I could create a SLurl for my posts. I have a few choices such as looking up information in the original post on blogHUD. Another idea popped into my head that had me thinking about experimenting with OCR. The benefit to OCR is that not only would it capture the location information at the bottom of each image, but it would also capture any other text within the photo itself – maybe.
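
Once the region name and coordinates are known for an image, building the SLurl itself is the easy part. Here is a minimal sketch, assuming the usual slurl.com pattern of region name followed by x/y/z coordinates; the function name `make_slurl` and the sample region are just placeholders for illustration, not something taken from blogHUD.

```python
from urllib.parse import quote


def make_slurl(region, x, y, z=0):
    """Build a SLurl from a region name and in-region coordinates.

    Assumes the slurl.com/secondlife/<region>/<x>/<y>/<z> layout.
    """
    # Region names can contain spaces, so percent-encode them.
    return "http://slurl.com/secondlife/{}/{}/{}/{}".format(
        quote(region), int(x), int(y), int(z)
    )


# Placeholder location, only to show the call.
print(make_slurl("Info Island", 128, 128, 25))
```

The hard part is still getting the region and coordinates reliably for each image.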

I tried out a few OCR programs and they didn’t turn out very well. The text within the images is too small, and a few letters get ruined. I need accuracy. It appears that I’ll be gathering that information from other locations anyway.

My ventures, however, had me thinking about similar images and locations, and I started looking into image analysis. I was wondering if there were any services or software that could identify objects within a picture, such as “woman”, “face”, “sunset”, “car”, etc. I haven’t had too much luck with computer vision, but here are a few things that I found so far.

General Picture Recognition Software

Dynamic Ventures Custom Object Software Development

Import colors from an image (color schemes)

Image Generation and Shape Recognition Toolkit

PhotoSketch (now called Sketch2Photo)

Quantum Picture – Look at the Flow Control Project.

The capabilities of machine vision today seem pretty far behind. Lots of people are working at the edge of the technology, but it’s not easily available to use and the accuracy is questionable. I’m still leaning towards the General Picture Recognition Software, but it costs 10 pounds, and that’s before I even know whether it works well enough.

Another option is to bring in some people through a service such as TagCow. I signed up for an account and got a $5.00 complimentary credit. Depending on how well the tagging is done, what it costs ($0.02 per uploaded image at the basic level), and how hard it is to retrieve the tags and insert them into my blog, I may go for the whole library. The TagCow API is based on REST: it can post, get, and delete images, as well as get the tags associated with the images. A convenient feature that I saw was a tagging result callback that pings my own web server with the tags as they are ready. There are different levels of service: basic (2 cents), premium (5 cents), and enterprise. Premium appears to be my target for web site search engine optimization. There is also Flickr integration.
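
To get a feel for the numbers, here is a rough sketch of the arithmetic, assuming the per-image prices are the 2 and 5 cents quoted for the basic and premium levels and that the $5.00 credit applies to either. The library size of 1,000 images is a made-up figure for illustration, and the helper names are my own.

```python
def images_for_credit(credit_cents, price_cents_per_image):
    """How many whole images a credit covers at a given per-image price."""
    return credit_cents // price_cents_per_image


def library_cost_dollars(image_count, price_cents_per_image):
    """Total cost in dollars to tag a library of image_count images."""
    return image_count * price_cents_per_image / 100


CREDIT_CENTS = 500                   # the $5.00 complimentary credit
PRICES = {"basic": 2, "premium": 5}  # cents per image, as quoted above
LIBRARY_SIZE = 1000                  # hypothetical library size

for level, price in PRICES.items():
    covered = images_for_credit(CREDIT_CENTS, price)
    total = library_cost_dollars(LIBRARY_SIZE, price)
    print(f"{level}: credit covers {covered} images, "
          f"{LIBRARY_SIZE} images would cost ${total:.2f}")
```

Working in whole cents keeps the division exact, so the credit covers 250 images at the basic level and 100 at premium.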

4 Responses to Better than OCR

  1. Graham Mills [RL: Peter Miller] says:

    Did you try Evernote? I guess there is a manual step and it isn’t instantaneous but it has an API as well.

  2. Evernote seems to be completely manual.

  3. Torrid Luna says:

    It would be handy if the SL screenshots would just contain the slurl/time/creator info in the metadata, rather than having to get that out via OCR later…

  4. Peter Miller says:

    Capture is indeed manual with Evernote but text in images becomes searchable after a while (depends on how long their queue is and whether you subscribe). Thereafter it should be searchable via the API as per http://www.evernote.com/about/developer/api/ (and with the caveat that I’ve never used it or read it!) Of course, you would need a strategy for sorting out image-derived data from anything else and some means of coping with the variable length delay. My experience with the OCR has been good though in terms of simple searching.
