XML to XHTML rendition and stylesheets via XSL•FO by RenderX - author of XML to PDF formatter. Originally used stylesheets by P.Mansfield.

Remixing RSS: Past, Present and Future

Copyright © 2005 Roland Tanglao

Keywords: content management, content, repurposing, metadata, semantic web, web services, RDF, RSS, remixing

Abstract

RSS (Really Simple Syndication; there are other acronyms but this is my personal favourite) was developed in the 1990s to enable automatic web surfing. An RSS 'feed' is simply an XML file using a very simple XML format with the latest updates to a site. Users can subscribe to an RSS feed using an RSS reader and be automatically kept apprised of that site's updates without manually having to check the website.

RSS was popularized by blogs, but it has been adopted by mainstream media sites like the New York Times, BBC, the Economist etc., by search engines like Yahoo, Google, etc., by RSS search engines like PubSub, Feedster, Technorati, etc. and is being built (or is already built) into popular software and operating systems like Microsoft Office, Windows Vista, Mac OS X, Safari, Firefox, etc.

The widespread adoption of RSS has enabled a user generated content revolution. How? Since RSS is XML, it is machine readable which enables near real time search engine indexing. This allows users to have almost real time distributed conversations on the web. The early adopters and VERY early majority on the web are now publishing their content using RSS to take part in this conversation.

A large part of this conversation is through "RSS Remixing". In its crudest form, RSS Remixing is bloggers subscribing to an RSS feed, and remixing it by quoting part of the text, adding some commentary and then publishing this on their blog. This updates their RSS feed which others can remix as well.

This has been done manually by users and in code by developers since the early days of RSS and is now being done increasingly automatically using tools such as Drupal (an open source content management system), Radio UserLand, FeedBurner and many, many other tools.

These tools enable any power user and early adopter to be an RSS remixer without any software or XML knowledge. Previously, this power was only available to developers and XML experts!

This high-level presentation will briefly introduce RSS and blogs and then illustrate with examples how users remix RSS today and how they did so in the past. It will also discuss the implications for knowledge management and sharing in a world where this kind of remixing is ubiquitous and where RSS is mainstream (as it will be post 2007 when it is part of Windows Vista) and eventually becomes transparent.

Table of Contents

1. Introduction to RSS and the Global Online Conversation    
1.1. What is RSS?    
1.1.1. User Definition of RSS    
1.1.2. Technical Definition of RSS    
1.1.2.1. What is an RSS Feed?    
1.1.3. What is a ping?    
1.1.4. How the Global Online Conversation Works    
2. What is RSS Remixing?    
3. Remixing - the Past    
4. Remixing - the Present    
5. Remixing - the Future    
5.1. Better Data: Microformats    
5.2. Better Algorithms    
5.3. New and Better Tools: Social Office and Portals    
5.4. RSS Becomes transparent    
Acknowledgements    
Bibliography    
Biography
Footnotes

1.  Introduction to RSS and the Global Online Conversation

RSS Remixing started with the human global online conversation. Therefore, this paper will begin with an introduction to this conversation and then discuss the past, present and future of RSS remixing.

1.1. What is RSS?

1.1.1. User Definition of RSS

RSS (Really Simple Syndication) is a two-part [1] indication to subscribers and search engines that a website has been updated. The website is most commonly a blog, but increasingly other kinds of websites and, in the future, anything on the internet that has updates. The two parts are:

1. An indication that the site has updated. This is called a 'ping'.
2. A list of updates to the site. This is in XML format and is also called an RSS "feed".

1.1.2.  Technical Definition of RSS

1.1.2.1. What is an RSS Feed?

RSS is a very simple XML file format. (There are other expansions of the acronym, but Really Simple Syndication is my favourite.) It has three popular variants:

[RSS 2.0]
[RSS 1.0]
[Atom]

Currently, RSS 2.0 is the most widely used, but all the variants are equivalent from the point of view of a "user remixing RSS". That phrase is unwieldy, so henceforth such users will be referred to as "remixers".

Here's an example of an RSS 2.0 file, adapted from the excellent Developing Feeds with RSS and Atom [Hammersley2005] and modified to show two items. It demonstrates how simple RSS is.

Example 1. Barebones RSS 2.0 Feed Example

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>A Very Simple Feed</title>
    <link>http://example.org/index.html</link>
    <description>A Very Simple RSS 2.0 Feed</description>
    <item>
      <pubDate>Thu, 03 Jan 2002 00:00:02 GMT</pubDate>
      <description>Roland cool description for item 1</description>
      <author>Roland Tanglao</author>
    </item>
    <item>
      <pubDate>Thu, 03 Jan 2002 00:00:01 GMT</pubDate>
      <description>Jim cool description 2</description>
      <author>Jim Smith</author>
    </item>
  </channel>
</rss>

The example omits the enclosure element which is used for podcasting. This subelement of the item tag is a link to an audio, video or other multimedia resource and is used by podcasting clients to automatically download audio to an MP3 player.
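As a rough illustration of what a podcasting client does with this element, the sketch below pulls the enclosure URLs out of a feed using Python's standard library. The feed content, the example.org URLs and the function name enclosure_urls are invented for illustration; the url, length and type attributes are the ones the RSS 2.0 specification defines for enclosure.

```python
import xml.etree.ElementTree as ET

# Hypothetical podcast feed; the URLs and sizes are invented for this example.
PODCAST_FEED = """<rss version="2.0"><channel>
<title>A Very Simple Podcast</title>
<link>http://example.org/index.html</link>
<description>A feed with one enclosure</description>
<item>
  <description>Episode 1</description>
  <enclosure url="http://example.org/ep1.mp3" length="1234567" type="audio/mpeg"/>
</item>
<item><description>A text-only post</description></item>
</channel></rss>"""


def enclosure_urls(feed_xml: str) -> list:
    """Return the url attribute of every enclosure element in the feed."""
    return [enc.get("url") for enc in ET.fromstring(feed_xml).iter("enclosure")]


print(enclosure_urls(PODCAST_FEED))
```

A real client would then download each URL and hand the file to the media player; that step is left out here.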

The most important thing to note here from a remixer's point of view is that the description element, which is a sub-element of the item tag and contains the content of a blog post, is an unstructured "blob of text". The text may be HTML, but that is all one knows. The HTML is not in any a priori known structure.

This is one of the great strengths and weaknesses of RSS. It is a strength because you can put anything in there and it is very simple to generate and can therefore even be hand-coded. It is a weakness because this means that software cannot assume or derive anything about the blob of text.

Attempts are being made to "fix" this via microformats, an approach also known as structured blogging. More on this in Section 5.1, “Better Data: Microformats”.

1.1.3. What is a ping?

A ping is an indication to search engines and other services that a site has updated. It started off as a standalone message just to indicate that a website has updated but now it is usually generated automatically together with an updated RSS feed.

Pings are implemented using a very simple [XML-RPC] message as documented in the [weblogs.com update notification spec].
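To make the ping concrete, here is a minimal sketch in Python that builds the XML-RPC request body for the weblogUpdates.ping method, which takes the weblog's name and URL. The site name and URL are placeholders, build_ping is a name chosen for this example, and the endpoint mentioned in the comment is an assumption (the historical weblogs.com address) that may no longer respond.

```python
import xmlrpc.client


def build_ping(site_name: str, site_url: str) -> str:
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps((site_name, site_url),
                               methodname="weblogUpdates.ping")


payload = build_ping("Joe's Blog", "http://example.org/")
print(payload)

# Sending the ping would look roughly like this. It is not executed here,
# and the endpoint is an assumption that may no longer be live:
# server = xmlrpc.client.ServerProxy("http://rpc.weblogs.com/RPC2")
# server.weblogUpdates.ping("Joe's Blog", "http://example.org/")
```

The whole message is a method name plus two strings, which is why blogging tools can send it automatically on every update.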

1.1.4.  How the Global Online Conversation Works

A detailed diagram of how RSS and pings work together to produce the Global Online Conversation.

Figure 1. How the Global Online Conversation Works

Referring to Figure 1 (How the Global Online Conversation Works):

For the purposes of this example, "Joe" and "Teresa" are two bloggers.

1. Joe writes something and publishes it to his website with RSS. It's probably a blog, but it could be any site that publishes RSS, like the BBC or the New York Times.

2. Joe's system updates his site's HTML, updates his RSS file and sends a 'ping' message to the 'Aggregation Ping Server' indicating that his site has updated.

3. Search engines like Google and RSS specific services like Feedster, Technorati and PubSub periodically ask the Aggregation Ping Server, "Which sites have updated?".

4. Because Joe's site sends pings, has an RSS file, and is easy to update (and is in fact updated) frequently, it gets re-indexed more quickly and its search engine rank is higher than that of a 'normal' site without RSS.

5A. Teresa uses a program called an RSS reader (e.g. FeedDemon [http://www.feeddemon.com/] on Windows, NetNewsWire [http://ranchero.com/netnewswire/] on Mac, or an online web application like Bloglines [http://bloglines.com/]) to subscribe to Joe's site. The RSS reader (also called an RSS aggregator) checks Joe's RSS file for updates periodically (usually once/hour or once per day) and notifies her of Joe's updates. Teresa no longer wastes time manually surfing Joe's site. She just checks her RSS reader. As a result, Teresa's information flow is more efficient and she can monitor more sites in less time.

OR

5B. Teresa either does not use an RSS reader or does not subscribe to Joe. She finds his posts through a search engine. Joe's posts are easier to find because his search engine ranking is higher, since his site has RSS.

6. Teresa disagrees with Joe and posts her response to her blog.

7. Teresa's system updates her site's HTML, updates her RSS file and sends a 'ping' message to the 'Aggregation Ping Server' indicating that her site has updated.

The above seven-step cycle can now repeat with Joe playing the role of Teresa and vice versa. Nor is it confined to Joe and Teresa: it can be ANYBODY on the internet. A blog is not needed to participate. All that is required is something that publishes RSS to the internet, even something as rudimentary as a social bookmarking service like del.icio.us [http://del.icio.us/].
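The polling in step 5A can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular RSS reader works: the two feed snapshots are invented, new_items is a hypothetical helper, and items are identified by guid, then link, then description text, whichever is present first.

```python
import xml.etree.ElementTree as ET


def new_items(feed_xml: str, seen: set) -> list:
    """Return descriptions of items not seen on earlier polls, updating `seen`."""
    fresh = []
    for item in ET.fromstring(feed_xml).iter("item"):
        # Prefer guid, then link; fall back to the description text itself.
        key = (item.findtext("guid") or item.findtext("link")
               or item.findtext("description"))
        if key not in seen:
            seen.add(key)
            fresh.append(item.findtext("description"))
    return fresh


# Two invented snapshots of Joe's feed, as fetched on consecutive polls.
FIRST_POLL = """<rss version="2.0"><channel><title>Joe</title>
<item><guid>joe-1</guid><description>Joe's first post</description></item>
</channel></rss>"""

SECOND_POLL = """<rss version="2.0"><channel><title>Joe</title>
<item><guid>joe-2</guid><description>Joe's second post</description></item>
<item><guid>joe-1</guid><description>Joe's first post</description></item>
</channel></rss>"""

seen = set()
print(new_items(FIRST_POLL, seen))
print(new_items(SECOND_POLL, seen))
```

On the second poll only the second post is reported, which is all Teresa's reader needs to show her.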

Voila! Near real time, global, distributed conversations between two or more people, as shown in the simplified diagram (Figure 2).

A simplified diagram of how RSS and pings work together to produce the Global Online Conversation.

Figure 2. How the Global Online Conversation Works (Simplified)

2. What is RSS Remixing?

In this paper, RSS Remixing is defined as taking an RSS feed in, processing it somehow (manually or programmatically) and producing an RSS feed out. Users want to do this to extract the knowledge they want out of the Global RSS Conversational 'noise' using search, analytics, etc.

Of course, developers have always done this and will continue to do this by creating software. This paper focuses on remixing without writing code and without using tools that require detailed XML domain knowledge like XSLT.
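For readers who are developers, the "feed in, process, feed out" definition can be sketched as a keyword filter in Python. This is only one possible kind of processing; the function name remix, the sample feed and the keyword are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def remix(feed_xml: str, keyword: str) -> str:
    """Keep only items whose description mentions `keyword`; return a new feed."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    # findall() returns a list, so removing items while looping is safe.
    for item in channel.findall("item"):
        text = item.findtext("description") or ""
        if keyword.lower() not in text.lower():
            channel.remove(item)
    return ET.tostring(root, encoding="unicode")


# Invented input feed with one relevant and one irrelevant item.
FEED_IN = """<rss version="2.0"><channel>
<title>A Very Simple Feed</title>
<link>http://example.org/index.html</link>
<description>A Very Simple RSS 2.0 Feed</description>
<item><description>Notes on RSS remixing</description></item>
<item><description>What I had for lunch</description></item>
</channel></rss>"""

feed_out = remix(FEED_IN, "rss")
print(feed_out)
```

The output is itself a valid feed, so another remixer can subscribe to it and remix it again.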

3. Remixing - the Past

In the early days (around 1999-2001), the original remixers remixed manually. They cut and pasted HTML code from their browsers into their blogging tool, added their annotations and blogged it back into the global conversation. This worked well when there were only a few thousand people in the global RSS conversation and one only paid attention to a few.

Developers and alpha geeks were the first to take this to an extreme and to start monitoring many, many conversations. It did not take long for developers to come up with tools to facilitate this and remove the manual tedium. These tools included RSS aggregators (the author prefers the term RSS reader) which automatically downloaded the new information (using RSS) from the sites you subscribed to.

One of the first was Radio UserLand [http://radio.userland.com/], a combined RSS reader and blogging system that automated what people were doing manually. From the reader, a remixer could remix (or "re-blog") an item from one of their RSS subscriptions. With Radio UserLand (see the Radio UserLand figure), the process was simple: to re-blog, one clicked "POST" next to the item, added commentary and clicked "Post to Weblog". Two clicks. Simple and very effective remixing.

Two clicks - 1. click "POST" 2. add annotation and click "Post to weblog"

Figure 3. The Radio UserLand Aggregator - early re-blogging example

4. Remixing - the Present

For the author, the remixing present started with the advent of RSS search engines (Feedster was one of the first in early 2003) and ended when Google introduced their RSS search engine (called Google Blog Search [http://www.google.com/blogsearch]) in September 2005.

In this era, RSS grew from thousands of blogs to millions of blogs, and beyond blogs to thousands of newspapers, magazines and other more established online media outlets. No one knows for certain why this RSS explosion started, but here are two likely reasons:

 
Established media and people noticed how effective RSS was after September 11, 2001 in bringing the conversation out and giving new perspectives on the news.
 
Websites with RSS that update frequently have a higher search engine rank than websites without RSS.

The RSS search engines (Feedster [http://feedster.com/], Technorati [http://www.technorati.com/], PubSub [http://www.pubsub.com/], Blogdigger [http://www.blogdigger.com/] et al.) were a remixer's dream. Born out of a need to let people find what they were looking for in an enormous number of online global RSS-powered conversations, they allowed users to specify what they wanted and, more importantly, to set up an RSS feed of their search.

With RSS search engines, there was no need to search or surf the global conversation manually. Remixers set up RSS search engine feeds for the keywords, organizations and topics they were interested in, and the news came to them. For more information, refer to [How to be a NewsMaster], the author's primer for remixers on how to set up RSS search feeds on two of the leading RSS search engines: PubSub and Feedster. (The term [RSS NewsMaster] was invented by Robin Good [http://www.masternewmedia.org/] to describe professional RSS remixers. The author believes this role will become part of the job description of 21st century librarians.)

Here is an example Feedster search for XML Conference 2005. Note the orange XML badge at the top right next to "Page 1 of 6". This badge is a link to an RSS feed which remixers would subscribe to in their RSS reader.

Note the orange XML Button at the top right - It links to the RSS feed for this search

Figure 4. RSS Search Engine Example - Feedster search for XML Conference Atlanta

There were other developments in this era as well in the areas of re-blogging and RSS feed slicing and dicing and most importantly tagging.

Re-blogging increasingly became a normal feature of remixer tools, both in desktop applications (e.g. on Mac OS X: ecto [http://ecto.kung-foo.tv/] and MarsEdit [http://ranchero.com/marsedit/], and on Windows: Qumana [http://qumana.com/] and BlogJet [http://blogjet.com/]) and in web applications (e.g. Drupal [http://drupal.org], an open source content management system or CMS). Drupal's re-blogging feature can be seen in this screenshot from UrbanVancouver.com. The little "b" buttons function like the Radio UserLand "POST" buttons.

Click on the "Blog It" buttons to re-blog an item from an RSS feed.

Figure 5. Drupal Aggregator - re-blogging example

RSS feed slicing, dicing, intersecting, etc., or what Seb Paquet calls [Amateur RSS Bricolage], became available to remixers through web applications like FeedBurner [http://www.feedburner.com/], Blogdigger [http://www.blogdigger.com/] and the recently introduced FeedShake [http://www.feedshake.com/].

In this FeedBurner example you can see a personal blog feed (the author's blog) combined with the author's RSS feed of photos from flickr, an online social photo blogging service, into one feed with both blog posts and photos.

In this case your blog feed and your flickr photo feed.

Figure 6. FeedBurner - example of splicing two feeds together
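The kind of splicing shown in the FeedBurner example can be approximated in code. The following Python sketch is not FeedBurner's implementation; it simply merges the items of two invented feeds and orders them newest first. It assumes every item carries a pubDate in the RFC 822 style with a GMT zone, and for brevity the output channel has only a title.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def splice(title: str, *feeds: str) -> str:
    """Merge the items of several RSS 2.0 feeds into one feed, newest first."""
    items = []
    for feed_xml in feeds:
        items.extend(ET.fromstring(feed_xml).iter("item"))
    # Assumes every item has a pubDate that email.utils can parse.
    items.sort(key=lambda i: parsedate_to_datetime(i.findtext("pubDate")),
               reverse=True)
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    channel.extend(items)
    return ET.tostring(rss, encoding="unicode")


# Invented stand-ins for a blog feed and a photo feed.
BLOG = """<rss version="2.0"><channel><title>Blog</title>
<item><description>Blog post</description>
<pubDate>Tue, 04 Oct 2005 10:00:00 GMT</pubDate></item>
</channel></rss>"""

PHOTOS = """<rss version="2.0"><channel><title>Photos</title>
<item><description>Photo: sunset</description>
<pubDate>Wed, 05 Oct 2005 09:00:00 GMT</pubDate></item>
<item><description>Photo: harbour</description>
<pubDate>Mon, 03 Oct 2005 08:00:00 GMT</pubDate></item>
</channel></rss>"""

print(splice("Blog + photos", BLOG, PHOTOS))
```

Because the function accepts any number of feeds, the same sketch also covers combining arbitrary feeds rather than only one's own.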

The final RSS bricolage example in this paper is a Blogdigger group which allows you to combine any arbitrary feeds; not just your own feeds as in the FeedBurner example. A more complete list of RSS bricolage tools can be found in [Library clips - RSS Bricolage Tools List].

In this case any RSS feeds, not just your own.

Figure 7. Blogdigger - example of splicing arbitrary feeds together

Finally, the last and probably the most significant development of the present era is that of tags.

Tags are the remixer's equivalent of keywords and were pioneered by Joshua Schachter's online social bookmark service, del.icio.us (and similar services like Furl [http://www.furl.net/], Spurl [http://www.spurl.net/], etc.). Tags are simply one or more keywords that remixers add to describe bookmarks. These bookmarks are stored on the web (not the user's local hard disk) and are by default available to all, which is why del.icio.us has been called a social bookmark service. Every tag has an associated RSS feed. Most importantly, remixers can mix and match tags' feeds to do RSS feed intersection. For example, as seen in the del.icio.us example figure, one can get an RSS feed for anything tagged with both "Atlanta" and "food", say, if one were travelling to Atlanta for a conference.

In this case, this RSS feed will have anything that appears in BOTH the Atlanta and food RSS feeds.

Figure 8. del.icio.us - an example of using tags to splice the intersection of feeds together
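Feed intersection is easy to express in code as well. The sketch below is an illustration only, not how del.icio.us computes it: the two feeds are simplified RSS 2.0 stand-ins with invented bookmarks, items are matched by their link, and the helper names links and intersect are hypothetical.

```python
import xml.etree.ElementTree as ET


def links(feed_xml: str) -> dict:
    """Map each item's link to its title for one RSS 2.0 feed."""
    return {item.findtext("link"): item.findtext("title")
            for item in ET.fromstring(feed_xml).iter("item")}


def intersect(feed_a: str, feed_b: str) -> dict:
    """Return the bookmarks (link -> title) that appear in both feeds."""
    a, b = links(feed_a), links(feed_b)
    return {url: a[url] for url in a if url in b}


# Invented stand-ins for the "Atlanta" and "food" tag feeds.
ATLANTA = """<rss version="2.0"><channel><title>atlanta</title>
<item><title>Aquarium</title><link>http://example.org/aquarium</link></item>
<item><title>BBQ joint</title><link>http://example.org/bbq</link></item>
</channel></rss>"""

FOOD = """<rss version="2.0"><channel><title>food</title>
<item><title>BBQ joint</title><link>http://example.org/bbq</link></item>
<item><title>Crepe recipe</title><link>http://example.org/crepes</link></item>
</channel></rss>"""

print(intersect(ATLANTA, FOOD))
```

Only the bookmark present in both tag feeds survives, which is the "Atlanta AND food" behaviour described above.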

Tags have their problems (e.g. spam tags, and disambiguation of tags with multiple meanings, such as maple the software [http://www.maplesoft.com/] versus maple trees). But they have a very low barrier to entry: tagging requires no analysis or commentary, as blogging does, and tags allow remixers to bring the non-RSS part of the web into the RSS global conversation with very little effort. A more complete treatment of tags can be found in [Powerbloggers turning to tags].

5. Remixing - the Future

WARNING: The following is the educated guesswork of somebody who has been blogging since 1999 and obsessed with RSS since 2001.

The future of remixing is now or, to be more accurate, has already begun. In addition to developments that are impossible to predict today, it will consist of tagging (as popularized by del.icio.us, flickr [http://flickr.com/] et al.), which began in the previous era for a few and will continue to expand to the masses, and also:

 
"better" data
 
better algorithms
 
new and better tools
 
RSS will become transparent.

5.1. Better Data: Microformats

[Microformats] from an RSS perspective are an attempt to make the "blob of text" in the RSS description field into something structured in a lightweight, human readable (in the same way that HTML is human readable) AND machine parseable representation using XHTML. Microformats build on the foundation of earlier efforts such as [Joe Reger's data blogging] and the [RVW RSS Extensions for Reviews].

Imagine if people could write restaurant reviews (refer to the example below of a restaurant review in the hreview microformat taken from the [hreview page on the Microformats wiki]) and those reviews were carried in a review microformat as part of an RSS item. Then it would be very easy to write tools that would enable remixers to aggregate, splice and dice the microformats that they care about (e.g. recipes, reviews, and events). This is the world enabled by the ubiquitous use of standard microformats with RSS.

Example 2. Restaurant review example in the hreview Microformat

<div class="hreview">
  <span><span class="rating">5</span> out of 5 stars</span>
  <h4 class="summary"><span class="item fn">Crepes on Cole</span> is awesome</h4>
  <span>Reviewer: <span class="reviewer fn">Tantek</span> -
    <abbr class="dtreviewed" title="20050418T2300-0700">April 18, 2005</abbr>
  </span>
  <blockquote class="description"><p>
    Crepes on Cole is one of the best little creperies in San Francisco. Excellent food and service. Plenty of tables in a variety of sizes for parties large and small. Window seating makes for excellent people watching to/from the N-Judah which stops right outside. I've had many fun social gatherings here, as well as gotten plenty of work done thanks to neighborhood WiFi.
  </p></blockquote>
  <p>Visit date: <span>April 2005</span></p>
  <p>Food eaten: <span>Florentine crepe</span></p>
</div>
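To show why this structure matters to tool builders, here is a rough Python sketch that extracts the review fields from such markup by class name, using only the standard library. It is not a complete hReview parser: HReviewParser and parse_hreview are names invented here, the sample is a shortened copy of the review, and the sketch assumes the markup contains no unclosed void elements such as br or img.

```python
from html.parser import HTMLParser


class HReviewParser(HTMLParser):
    """Collect the text of elements whose class names one of FIELDS."""
    FIELDS = {"rating", "summary", "item", "reviewer", "description"}

    def __init__(self):
        super().__init__()
        self.stack = []   # one set of matching class names per open element
        self.review = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        self.stack.append(classes & self.FIELDS)
        if "dtreviewed" in classes:
            # The machine-readable date lives in the title attribute.
            self.review["dtreviewed"] = attrs.get("title") or ""

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Text belongs to every enclosing element that names a field.
        for classes in self.stack:
            for name in classes:
                self.review[name] = self.review.get(name, "") + data


def parse_hreview(html: str) -> dict:
    parser = HReviewParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left over from the markup layout.
    return {k: " ".join(v.split()) for k, v in parser.review.items()}


SAMPLE = """<div class="hreview">
<span><span class="rating">5</span> out of 5 stars</span>
<h4 class="summary"><span class="item fn">Crepes on Cole</span> is awesome</h4>
<span>Reviewer: <span class="reviewer fn">Tantek</span> -
<abbr class="dtreviewed" title="20050418T2300-0700">April 18, 2005</abbr></span>
<blockquote class="description"><p>Excellent food and service.</p></blockquote>
</div>"""

print(parse_hreview(SAMPLE))
```

With the fields available as a plain dictionary, aggregating or filtering reviews (for example, all five-star restaurants) becomes ordinary data handling rather than text mining.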

5.2. Better Algorithms

Remixers will benefit from prior art in artificial intelligence to use in analyzing the RSS "blobs of text" for filtering and other purposes. For example, the recently launched SearchFox [http://rss.searchfox.com/] uses machine learning algorithms. This trend will continue.

Another sort of prior art that remixers will benefit from is voice recognition which will be incorporated into remixer services for filtering of software generated transcripts of spoken word podcasts and videoblogs.

Developers have high quality open source software libraries for parsing RSS feeds, such as [Mark Pilgrim's Universal Feed Parser for Python] and the [Magpie RSS Library for PHP]. These libraries make RSS accessible to developers without XML domain knowledge (the only knowledge required is how to use arrays, which makes [RSS like a poor man's API]).

To date, however, there are no similar libraries for RSS remixing. Obviously, the RSS search engines have libraries that remix internally. In the future, open source libraries from the RSS search engines and others will emerge to make the building of software and services for remixers much easier.

Apple has already incorporated RSS into its web browser, Safari, on Mac OS X. Perhaps Apple will release an open source 'RSSKit' as they have done with the 'WebKit' which is part of Safari. RSS Support will be part of all operating systems in the future instead of being 3rd party add on libraries as it is today. At the time of the writing of this paper (September 2005), [Microsoft announced plans to build APIs for RSS remixing into Windows Vista], the next version of Windows. As a result, RSS remixing programs will become even more numerous on the Mac and Windows and of course, also Linux when RSS support is built into it.

5.3. New and Better Tools: Social Office and Portals

RSS is well on its way to becoming ubiquitous and will become a lowest common denominator of data transport and interchange on the Internet. This means RSS will be everywhere and if something on the net is not available in RSS, it will be possible via web services for remixers to convert it to RSS.

Therefore, in addition to the straightforward integration of RSS in and RSS out of programs like [Microsoft Office 12], there will be some more interesting uses of RSS. One is for new types of software like the Social Office (a new term for a suite of programs and services for the new social services which are becoming the 'office of the web' like del.icio.us bookmarking, blogging, photo blogging with flickr, podcasting and videoblogging) and the other is for portals: personal portals and portals for business and organizations.

One of the harbingers of the Social Office is the [flock] web browser, which its developers have dubbed the "Social Web Browser" since it integrates in an easy to use fashion the new social features of the web such as photo blogging from flickr, blogging using standard APIs to your blog(s), social bookmarks like delicious and more. What if flock incorporated RSS remixing into it so that remixers could do whatever they want with the RSS feeds they generate and the ones that they are interested in and display the result in an attractive and usable manner?

In other words, what if remixers could, based on their remixed RSS feeds of interest, create their own customized portal? This is the premise behind [Marc Canter's vision of a digital lifestlyle aggregator] which is for individuals remixing their feeds of interest.

The author believes that this is also the vision behind the [Tucows Start Service], which allows businesses to generate custom portals based on their RSS feeds of interest.

Furthermore, as RSS continues to become more and more used for non-human events like industrial processes (e.g. number of widgets made in the last hour) and product development processes (e.g. checkins to a software development source code control repository), and thereby becomes the universal container for enterprise information flows, [RSS will be the basis for a real-time enterprise console].
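As a small illustration of non-human events carried in RSS, the Python sketch below turns a list of (description, timestamp) events into a feed. The event text, the feed title and the function name event_feed are all invented; a production feed would also need the channel's link and description elements.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def event_feed(title: str, events: list) -> str:
    """Build an RSS 2.0 feed with one item per (description, datetime) event."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    for description, when in events:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "description").text = description
        # usegmt=True requires a timezone-aware UTC datetime.
        ET.SubElement(item, "pubDate").text = format_datetime(when, usegmt=True)
    return ET.tostring(rss, encoding="unicode")


# Hypothetical production-line reading.
events = [("120 widgets made in the last hour",
           datetime(2005, 9, 15, 14, 0, 0, tzinfo=timezone.utc))]
print(event_feed("Widget line 1", events))
```

Anything that can emit such items, whether a factory counter or a source code repository, can then be read, searched and remixed with the same tools as a blog.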

5.4. RSS Becomes transparent

When RSS generation, aggregation and bricolage are built into everything, RSS will become invisible. If the infrastructure is in place everywhere to remix RSS, then tools and applications can be built that will enable users to concentrate on retrieving and sharing what they want without having to know the technology underneath. This is the ultimate future of RSS remixing. In the long term, like SMTP (the Simple Mail Transfer Protocol), RSS will disappear from view and people will use it without knowing it is there.

Acknowledgements

Thanks to the users: the bloggers, podcasters and videobloggers for creating compelling content constantly [http://www.rolandtanglao.com/archives/2004/07/31/anil_its_not_about_seo_its_about_creating_compelling_content_constantly]. Thanks to the developers for RSS and the cool remixing tools. Thanks to Dave Winer [http://scripting.com/] for kickstarting the RSS revolution.

Bibliography

[Hammersley2005]
Ben Hammersley. Developing Feeds with RSS and Atom. O'Reilly, April 2005.
[RSS 2.0]
RSS 2.0 Specification. Available at http://blogs.law.harvard.edu/tech/rss .
[RSS 1.0]
RSS 1.0 Specification. Available at http://web.resource.org/rss/1.0/ .
[Atom]
Atom 1.0 Specification. Available at http://ietfreport.isoc.org/idref/draft-ietf-atompub-format/ .
[XML-RPC]
XML-RPC Specification. Available at http://www.xmlrpc.com/spec .
[weblogs.com update notification spec]
Weblogs.Com XML-RPC Interface. Available at http://www.xmlrpc.com/weblogsCom .
[Amateur RSS Bricolage]
The algebra of feeds, or the amateurization of RSS bricolage . September 2003 Available at http://radio.weblogs.com/0110772/2003/09/11.html#a1098 .
[Library clips - RSS Bricolage Tools List]
RSS: filter and re-mix . Available at http://libraryclips.blogsome.com/2005/05/20/rss-filter-and-re-mix/#eight .
[Powerbloggers turning to tags]
Powerbloggers turning to tags . Available at http://www.alexandrasamuel.com/archive/today-in-the-toronto-star-tagging/ .
[Microformats]
About microformats . Available at http://microformats.org/about/ .
[hreview page on the Microformats wiki]
hReview 0.2. Available at http://microformats.org/wiki/hreview .
[How to be a NewsMaster]
Lazy Person's Guide to being a NewsMaster Part 3:How to be a NewsMaster. Available at http://bryght.com/node/148 .
[RSS NewsMaster]
The RSS NewsMaster. Available at http://www.masternewmedia.org/2004/03/02/the_rss_newsmaster.htm .
[Joe Reger's data blogging]
What is datablogging?. Available at http://www.reger.com/biz/what-is-datablogging.log .
[RVW RSS Extensions for Reviews]
RVW module for syndicating and aggregating reviews. Available at http://www.pmbrowser.info/rvw/0.2/ .
[Magpie RSS Library for PHP]
MagpieRSS: RSS for PHP. Available at http://magpierss.sourceforge.net/ .
[Mark Pilgrim's Universal Feed Parser for Python]
Universal Feed Parser. Available at http://sourceforge.net/projects/feedparser .
[RSS like a poor man's API]
RSS is a Poor-Man's API. Available at http://www.undeniablygeeky.com/weblog/2005/09/16/16 .
[Microsoft announced plans to build APIs for RSS remixing into Windows Vista]
Bill Gates interview:the transcript. Available at http://weblog.infoworld.com/udell/2005/09/15.html#a1302 .
[RSS will be the basis for a real-time enterprise console]
RSS is the goldmine but there would be no RSS without blogs. Available at http://www.rolandtanglao.com/2003/10/29.html#a5723 .
[flock]
Flock - the social web browser. Available at http://www.flock.com .
[Marc Canter's vision of a digital lifestlyle aggregator]
Digital Lifestyle Aggregators. Available at http://blogs.it/0100198/stories/2004/02/28/digitalLifestyleAggregators.html .
[Tucows Start Service]
Tucows Start Service - Tucows Beta Program. Available at http://beta.tucows.com/blog/_WebPages/TucowsStartService.html .
[Microsoft Office 12]
Office 12 has new UI and cool RSS Support. Available at http://weblogs.jupiterresearch.com/analysts/gartenberg/archives/010450.html .

Biography

Roland Tanglao
Chief Blogging Officer
Bryght [http://www.bryght.com]
Vancouver
British Columbia
Canada
http://www.rolandtanglao.com/ [http://www.rolandtanglao.com]

Roland Tanglao graduated from the University of Waterloo with a degree in Systems Design Engineering. Working at Nortel Networks, he ran its first internal corporate blog focusing on developer relations. Roland has been blogging since 1999, and was the first business blogging consultant in Canada. Roland is one of the founders of Bryght and as Bryght's Chief Blogging Officer, he reads hundreds of blogs daily through his RSS reader and participates in many online communities. He is an expert community manager, with UrbanVancouver.com and his personal restaurant review site, VanEats.com, being the two best examples.

Footnotes

  1. Technically, pings are not part of RSS, but in practice most systems that generate RSS also generate pings, and pings are the key to real time re-indexing by search engines, which is crucial to online conversations.
