A few weeks ago, I wrote an overview of the semantic web – and how Google’s Rich Snippets feature uses it. Now it’s time to dive in a little deeper – take your semantic scuba gear along – and discover the differences between the types of semantic markup.
Any type of semantic markup needs to have a vocabulary defined somewhere on the internet – so that people can learn what the terms are and, in some cases, so that a computer can work out what the terms mean.
Google officially supports microformats and RDFa:
Microformats
Microformats are a set of vocabularies developed and hosted on microformats.org. Any interested party can contribute to the development of new and existing vocabularies. Some of the vocabularies are stable and established and some are drafts and are liable to change. (Interestingly enough, Google supports the hCard vocabulary for defining People and Organization, which is an established vocabulary, and the hReview vocabulary for defining reviews, which is listed as a draft. I guess Google assumes it won’t change too much in the future.)
When you use a microformats vocabulary on your webpages, you first define which vocabulary you are using by writing <div class="insert vocabulary here">. Then you define the term you are using from that vocabulary by writing <span class="insert property here">.
Example: <div class="hreview">
Then: <span class="rating">
(*note – “div” and “span” are not absolutes – use whatever tag fits with your formatting)
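To make the "vocabulary first, then property" pattern a bit more concrete, here is a minimal sketch in Python of how a program might spot microformats class names in a page. The page fragment and the ClassCollector helper are my own illustrations, not anything Google or microformats.org publishes, and a real parser would do far more than list class names.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: an hReview block containing two properties.
SNIPPET = """
<div class="hreview">
  <span class="summary">Best brownies in town</span>
  <span class="rating">5</span>
</div>
"""


class ClassCollector(HTMLParser):
    """Illustrative helper: record every class name in document order."""

    def __init__(self):
        super().__init__()
        self.class_names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.class_names.extend(value.split())


collector = ClassCollector()
collector.feed(SNIPPET)
print(collector.class_names)
```

The first name found is the vocabulary (hreview) and the ones nested inside it are the properties, which is the same two-step pattern described above.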
RDFa
RDFa is a practical way of using RDF – Resource Description Framework, which is a description format using “triples” of subject – property – value of the property (or, in their technical terminology, subject – predicate – object). For example, using our brownie example from last post:
1 cup brown sugar
This statement is about the subject of brown sugar. It contains an inherent property, which is the amount of brown sugar, and the value of the property which is 1 cup.
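If it helps to see the triple written out, here is a small Python sketch of the brown sugar statement. The Triple type and the field names are assumptions made up for illustration; RDF itself does not prescribe this representation.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Illustrative container for one subject - predicate - object statement."""

    subject: str
    predicate: str
    obj: str


# "1 cup brown sugar" expressed as subject - property - value.
ingredient = Triple(subject="brown sugar", predicate="amount", obj="1 cup")

print(f"{ingredient.subject} | {ingredient.predicate} | {ingredient.obj}")
```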
There are a number of RDFa vocabularies on the internet – just like microformats, the potential number is infinite. If you want to write an RDFa vocabulary, go right ahead. The difference between microformats and RDFa is that while microformat vocabularies are all hosted and edited on the microformats.org website, anyone can write an RDFa vocabulary and host it wherever he wants.
But if the framework is so open, how in the world (wide web) does a computer reading your webpage’s code know what you’re referring to?
When you use an RDFa vocabulary on your webpages, you first define the “name space” – the place on the internet where the list of terms and definitions is located. You do this by writing: <div xmlns:abbreviation for your vocabulary="URL where your vocabulary is located">. After that, for every property you want to mark up, you write: <span property="abbreviation for your vocabulary:property">
Example: <div xmlns:dc="http://purl.org/dc/elements/1.1/">
Then: <span property="dc:title">
Huh? Does that sound confusing to you? It did to me… and it still does. Let’s take it slowly.
xmlns is short for “XML name space”, meaning the XML document where your vocabulary is defined.
dc in our example was the abbreviation that the creators chose to stand for the URL where their vocabulary is located (it stands for Dublin Core).
http://purl.org/dc/elements/1.1/ is the URL itself.
Now the computer reading the RDFa knows that when it sees dc in the future, it is going to refer to a term contained within that URL, thus eliminating the need to rewrite the URL of the vocabulary source every single time you define a property. When you add <span property="dc:title">, you are effectively telling the computer to go to the URL above and find the “title” section to understand what the property means. The above URL actually redirects to http://dublincore.org/2008/01/14/dcelements.rdf#, making “dc:title” roughly equivalent to http://dublincore.org/2008/01/14/dcelements.rdf#title.
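The abbreviation trick can be sketched in a few lines of Python. This is only an illustration of the idea: the PREFIXES table and the expand function are hypothetical names of mine, and the sketch simply joins the declared name space URL to the term without following any redirects.

```python
# Prefix table, as it would be built from xmlns:dc="http://purl.org/dc/elements/1.1/".
PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}


def expand(term, prefixes):
    """Turn an abbreviated term such as 'dc:title' into a full URL."""
    prefix, _, local_name = term.partition(":")
    if prefix not in prefixes:
        # The page never declared this abbreviation, so it cannot be resolved.
        raise KeyError(f"unknown prefix: {prefix}")
    return prefixes[prefix] + local_name


print(expand("dc:title", PREFIXES))
```

The point is that the abbreviation is just shorthand: the page declares it once, and every later dc:something is resolved against the same URL.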
So now you’re about to go create your own RDFa vocabulary so you can mark up your website about your collection of Madagascar hissing cockroaches, with properties like: name, length, favorite food, best trick, decibel level of hiss…
Not so fast. If you look at Google’s instructions for marking up your website with RDFa, you see that you are instructed to write <div xmlns:v="http://rdf.data-vocabulary.org/#">. This is Google’s name space, where it defines the properties it supports in RDFa. Currently Google only supports properties relating to people, reviews, products and organizations, and only information relating to people and reviews is actually used for display purposes. In their latest update on Rich Snippets, Google announced that they will recognize FOAF (the Friend of a Friend vocabulary) and vCard terms that are equivalent to the terms they support. vCard is one of the earlier published web standards for defining properties of people and organizations – microformats’ hCard and Google’s RDFa definitions are based on this standard.
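Here is a rough Python sketch of what reading properties marked up with Google’s name space might look like. The page fragment, the person in it and the PropertyReader helper are all made up for illustration, and the sketch only handles simple, non-nested tags; it is not how Google’s own parser works.

```python
from html.parser import HTMLParser

# Hypothetical fragment using the v: abbreviation for Google's name space.
SNIPPET = """
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
  <span property="v:name">Jane Doe</span>,
  <span property="v:title">Cockroach Trainer</span>
</div>
"""


class PropertyReader(HTMLParser):
    """Illustrative helper: map each property attribute to the text inside its tag."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        # Remember the property name of the tag just opened (None if it has none).
        self._current = dict(attrs).get("property")

    def handle_data(self, data):
        # Store non-blank text that sits directly inside a property tag.
        if self._current and data.strip():
            self.properties[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


reader = PropertyReader()
reader.feed(SNIPPET)
print(reader.properties)
```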
Which vocabulary to use?
So with your newfound awareness of semantic markup, you now want to go and mark up all your pages using a vocabulary that Google recognizes (your Madagascar hissing cockroaches will have to wait – probably for quite a long time).
But… which vocabulary should you use? The following is a bit of conjecture, without any definitive conclusions, but may provide food for thought as you make your decision.
Popularity
Microformats have historically (as far as you can call such a recent technology historical) been the more popular of the two. A search on Google Trends shows that “microformats” had enough search volume to register starting in 2006, whereas “RDFa” only showed up on the chart starting in 2009 – and it’s had its ups and downs.

Google Trends for Microformats and RDFa - since 2006
A closer look at 2009:

Google Trends for Microformats and RDFa - 2009 only
On the other hand, search volume for “microformats” seems to have been declining since 2006 – slowly, to be sure, but steadily.
Ease of Use
Microformats seem a little bit simpler to use: take a look at Google’s example for the difference in marking up information about a person in microformats and in RDFa (you have to scroll down a bit on the page for the examples).
On the other hand, do or will CMSs (content management systems) offer built-in support for one or the other? In the comments to the last blog post on this topic, Amir Simantov mentioned that Drupal has some built-in RDFa capabilities.
If you don’t use a CMS that has those capabilities, can your site programmer build either one more easily into your templates? (For a client of ours who is building a new e-commerce website, we’re recommending that the programmer look into building these semantic mark-up capabilities into the product pages.)
I asked the SEOmoz staff which vocabulary they prefer, but they hadn’t come to any conclusions either. They did confirm my hesitation about mixing and matching vocabularies – so once you pick one, stick with it.
Are there any other points you think are relevant to deciding which one to use? Have you had experience with either microformats or RDFa and can share some advice with us? The comments are waiting for you.
(Okay – how many of you actually clicked on the Madagascar hissing cockroaches site? Admit it – in the comments.)