Wednesday, December 14, 2005

A tag based file system (with Bayesian Auto-tagging)

So having been foiled by OmniDrive in my desire to create an Internet based virtual drive, I've moved on to other, perhaps greener and definitely less populated pastures.

If you'll remember, I spoke about adding tagging support to the virtual drive I was talking about. I said that we could "... add tagging support to the virtual drive. You can tag files and folders and view virtual 'tag' folders with links to those files. Mainstream OS's don't have a tagging mechanism for files, so we'll have to add meta-data through file names. e.g. end file names with a special character and the tags (i.e. myfile.txt#work,proposal,text) which will be stripped off before being saved to the virtual drive."

I won't be working on the 'Internet' part of the virtual drive, but I can certainly implement this idea. Why not create a FUSE-based file system with tag-based virtual folders? Use either folder names (/mnt/tagfs/here:are:some tags/myfile.txt) or file names (/home/user/myfile.txt:here:are:some tags) to add tags, and then use virtual tag directories to navigate through those tags. You can use mv to change the tags associated with a file, and so on.
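To make the naming scheme concrete, here is a rough Python sketch of the tag parsing and the virtual tag folders. It leaves out the FUSE layer entirely; the names split_file_tags, split_folder_tags and TagIndex are made up for illustration, and I'm assuming ':' separates tags and that a multi-tag folder lists the files carrying all of its tags.

```python
import posixpath
from collections import defaultdict


def split_file_tags(path):
    """Split '/home/user/myfile.txt:here:are:some tags' into (clean_path, tags)."""
    directory, name = posixpath.split(path)
    base, *tags = name.split(":")
    return posixpath.join(directory, base), [t for t in tags if t]


def split_folder_tags(path, mount="/mnt/tagfs"):
    """Split '/mnt/tagfs/here:are:some tags/myfile.txt' into (file_name, tags)."""
    relative = posixpath.relpath(path, mount)
    folder, name = posixpath.split(relative)
    return name, [t for t in folder.split(":") if t]


class TagIndex:
    """In-memory stand-in for the virtual tag directories (hypothetical)."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, name, tags):
        for tag in tags:
            self._by_tag[tag].add(name)

    def retag(self, name, new_tags):
        # Roughly what an mv between tag folders would do: replace the tags.
        for files in self._by_tag.values():
            files.discard(name)
        self.add(name, new_tags)

    def listdir(self, tags):
        # Assumption: a folder named after several tags shows files having all of them.
        sets = [self._by_tag.get(t, set()) for t in tags]
        return sorted(set.intersection(*sets)) if sets else []
```

A real implementation would call something like retag from the file system's rename handler and listdir from its directory listing handler, and would need to persist the index somewhere.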

In addition, we can have a Bayesian Auto-tagger which learns which tags you've previously used for files of a certain type and then automatically tags them appropriately if no tags are supplied. The more you tag, the better the auto-tagger.
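As a sketch of what the auto-tagger might look like, here is a small naive Bayes classifier in Python. Everything in it is an assumption on my part: the features (the file extension plus the words in the file name), the add-one smoothing, and the names features and AutoTagger. A real version would probably look at file contents as well.

```python
import math
import re
from collections import Counter, defaultdict


def features(filename):
    """Hypothetical feature set: the extension plus lowercase words in the name."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension at all
        stem, ext = filename, ""
    words = re.findall(r"[a-z0-9]+", stem.lower())
    return ["ext:" + ext.lower()] + ["word:" + w for w in words]


class AutoTagger:
    """Naive Bayes over file-name features; each tag is treated as a class."""

    def __init__(self):
        self.tag_counts = Counter()                # how often each tag was used
        self.feature_counts = defaultdict(Counter)  # tag -> feature -> count
        self.vocabulary = set()                    # every feature seen so far

    def learn(self, filename, tags):
        feats = features(filename)
        self.vocabulary.update(feats)
        for tag in tags:
            self.tag_counts[tag] += 1
            self.feature_counts[tag].update(feats)

    def suggest(self, filename, limit=2):
        total = sum(self.tag_counts.values())
        feats = features(filename)
        scores = {}
        for tag, count in self.tag_counts.items():
            seen = sum(self.feature_counts[tag].values())
            score = math.log(count / total)  # log prior for the tag
            for feat in feats:
                # Add-one smoothing so unseen features never give probability zero.
                p = (self.feature_counts[tag][feat] + 1) / (seen + len(self.vocabulary) + 1)
                score += math.log(p)
            scores[tag] = score
        return sorted(scores, key=scores.get, reverse=True)[:limit]
```

The file system would call learn whenever a file is saved with explicit tags and suggest whenever a file arrives with none, which is where "the more you tag, the better the auto-tagger" comes from.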

Who knows, I might decide to run with this one! :-)

The saga continues...

5 comments:

Unknown said...

Have you followed up on this at all? I've always thought that a tag based file system would be a lot better than the folder based ones. Could this be the first one?

Classificity Classifieds Classifier said...

Bump--any progress? other systems?
I am working on Classificity, a site to classify classified ads during a purchase search.

Fernando Rosa said...

Was there any update on this? This really seems like a great idea.

DM42 said...

I've implemented something similar to this (no auto-tagging) for organizing a repository of files/documents/etc.

You can find more info at the Google Code repository I set up for the project. It's not under "active" development, but if there are questions/problems/concerns, drop a note.

DM42 said...

Oops... thought I included the link... sorry - http://code.google.com/p/htaggingolfs/