Finding related tags for blog posts

As you can see, most blog posts here come with a list of "Related tags". Those tags are chosen automatically for each post. Most of them fit the post, some do not. If you look at the post just before this one, you can see that its content is written in German. Amazingly, one of its tags is "deutsch". The code behind it is no rocket science: all I do is look for matching entries in the database and group by the categories found for those entries. That's it.
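
To make the "match entries, then group by category" idea concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not the code running on this blog: the tables `entries` and `entry_categories`, their columns, the sample rows, and the helper names `related_tags` and `demo_connection` are all assumptions made up for illustration, and a plain `LIKE` stands in for whatever matching the real code does.

```python
import sqlite3


def related_tags(conn, terms, limit=5):
    """Rank categories by how many matching entries carry them."""
    if not terms:
        return []
    # One LIKE clause per term; an entry matches if any term occurs in it.
    where = " OR ".join("e.body LIKE ?" for _ in terms)
    sql = f"""
        SELECT c.category, COUNT(DISTINCT e.id) AS hits
        FROM entries e
        JOIN entry_categories c ON c.entry_id = e.id
        WHERE {where}
        GROUP BY c.category
        ORDER BY hits DESC, c.category ASC
        LIMIT ?
    """
    params = [f"%{term}%" for term in terms] + [limit]
    return conn.execute(sql, params).fetchall()


def demo_connection():
    """Build a tiny hypothetical database to try the query on."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entries (id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE entry_categories (entry_id INTEGER, category TEXT);
        INSERT INTO entries VALUES
            (1, 'Ein Beitrag auf deutsch über Datenbanken'),
            (2, 'Noch ein Beitrag über Suche'),
            (3, 'An English post about search');
        INSERT INTO entry_categories VALUES
            (1, 'deutsch'), (1, 'database'),
            (2, 'deutsch'), (2, 'search'),
            (3, 'search');
    """)
    return conn
```

With this sample data, `related_tags(demo_connection(), ["Beitrag"])` ranks "deutsch" first, because both matching entries carry that category. That is the same effect as the "deutsch" tag showing up on a German post.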

One way to refine the results would be to tokenize the content of a blog post, stem the tokens [1], and put them into a FULLTEXT search query against the existing entries. That code, on the other hand, is not yet finished 😉
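
Since that code does not exist yet, the following is only a rough sketch of how the tokenize, stem, and query steps could fit together. The stemmer is a deliberately naive suffix stripper for English, not a real algorithm such as Porter's. The table `entries`, the columns `title` and `body`, and the function names are hypothetical. The query assumes MySQL's boolean full-text mode and a driver that uses `%s` placeholders.

```python
import re

# Deliberately tiny suffix list; a real stemmer handles far more cases.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Split text into lowercase runs of letters (umlauts included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_fulltext_query(text, min_length=4):
    """Return (sql, params) for a boolean-mode FULLTEXT search."""
    stems = []
    for token in tokenize(text):
        if len(token) < min_length:
            continue  # skip short words such as "for" or "the"
        stem = crude_stem(token)
        if stem not in stems:
            stems.append(stem)
    # A trailing * asks boolean mode to match any word with this prefix.
    against = " ".join(f"{stem}*" for stem in stems)
    sql = (
        "SELECT id FROM entries "
        "WHERE MATCH (title, body) AGAINST (%s IN BOOLEAN MODE)"
    )
    return sql, (against,)
```

For the title of this post, `build_fulltext_query` produces the search string `find* relat* tag* blog* post*`. The entries found that way could then be grouped by category exactly as before.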

[1] http://google.ch/search?q=define:stemmer
