Toward a private blogosphere?
As you know, we have recently been under fierce attack from a blogger who accused us of “stealing” his content. The whole story is available in our previous post and also on this blogger’s blog (I will let you find out which blog it is by yourself).
We are a search engine whose primary mission and value proposition, like that of any other retrieval system, is to index Internet content, make it searchable by Internet users and make it available to them. Beyond the fact that this blogger publicly treated us like the worst criminals on earth, this case raises a real ontological question:
Are we going toward the emergence of a private Internet?
We feel this is a truly absurd situation: a new breed of bloggers, in the name of the right to privacy, wants AT THE SAME TIME
- to publish their content on the World “Wild” Web which is not really the right place for those who look for privacy,
- to have a great readership, gain a maximum audience and have many comments on their posts,
- and, finally, not to be found via RSS search engines, aggregators or Digg-likes, nor to have their content duplicated!
We are also bloggers, so we know what we are talking about. Normally, when a blogger decides to publish content on the Internet, he/she expects (even prays!) that it will be visible to other Internet users. A community then forms around his/her blog, with interactive exchanges, new real or virtual friendships and even some fame for the most talented ones.
As a blogger myself, I am pretty happy to see my blog content duplicated somewhere, because it gives me a greater chance of getting additional readers that I would otherwise never get. Very often, when I find interesting content on a Digg-like, I tend to visit the original blog to browse the beautiful environment set up by the blogger. Usually, on search engines and Digg-likes, there is a link just below the cached content (“duplicated” content, if you prefer) which allows readers to jump directly to the original blog. A search engine only offers the possibility to FIND content; reading it is another part of the user experience. It is far more pleasant to read a blog post on the blog itself (because of all the design features and multimedia elements offered by the blogger) than to read the raw text of the cached content displayed on a search engine! So, we urge users to search for content and find it in search engines, then to switch naturally to the original blog to read it in its original environment!
Search engines need to cache Internet content in order to make themselves visible on the Internet and in other search engines, but also to get blogs indexed, make their authors visible and help them gain the readers and audience they need. Without search engines, the vast majority of blogs would be completely unknown and would never get a single comment! They would stay in the gloomy waters of the Internet! We will soon explain the role of “caching” in detail.
But the problem is not how to explain the benefits of search engines to bloggers or how to convince them. Most of them know the value and purpose of retrieval tools.
No, the point is that a number of bloggers, as written above, would like to keep their content private while publishing it on the Internet. That looks like an oxymoron! But this is a real trend, and more and more bloggers are starting to complain, sometimes very aggressively.
The purpose of this post is to try to find an honorable solution for everybody: 1) the Internet search systems, 2) the bloggers, 3) both parties
1) Recommendations and guidelines for the attention of retrieval tools (to us and our peers, in fact):
NB: this is the lesson we learnt from our recent issue with the blogger. In a way, the experience was useful. It is a pity this blogger was so aggressive, and neither open to discussion nor patient.
- Try to cache only excerpts of the original content and not the whole content. A number of characters should be set as a standard, for example 600 characters,
- Honor all META instructions such as nocache, noarchive, nofollow, etc. (we are currently fixing this for the next release of our service),
- Offer a blogger the possibility to get his URLs removed as quickly as possible through a dedicated “contact form”.
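To make the first two recommendations more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the code of our service: the function and class names are hypothetical, the 600-character limit is just the example figure given above, and treating nocache the same way as noarchive is our own simplification.

```python
from html.parser import HTMLParser

# Assumed cap, taken from the "600 characters" example above.
MAX_EXCERPT_CHARS = 600

# Directives that this sketch treats as "do not keep a cached copy" (an assumption).
NO_CACHE_DIRECTIVES = {"noarchive", "nocache"}


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots META directives declared in a page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def make_excerpt(text, limit=MAX_EXCERPT_CHARS):
    """Return at most `limit` characters, cut at a word boundary when possible."""
    text = " ".join(text.split())  # collapse runs of whitespace
    if len(text) <= limit:
        return text
    cut = text[:limit]
    last_space = cut.rfind(" ")
    if last_space > 0:
        cut = cut[:last_space]
    return cut


def cache_policy(html, post_text):
    """Decide whether to index a post and which excerpt, if any, to cache."""
    directives = robots_directives(html)
    if "noindex" in directives:
        return {"index": False, "excerpt": None}
    if directives & NO_CACHE_DIRECTIVES:
        return {"index": True, "excerpt": None}
    return {"index": True, "excerpt": make_excerpt(post_text)}
```

A crawler would call cache_policy(page_html, post_text) for each post and store only what the result allows: nothing at all for a noindex page, a link without cached text for a noarchive page, and a short excerpt otherwise.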
2) Recommendations and guidelines for the attention of the bloggers who want their content not to be searchable or found via retrieval tools
NB: the ones who want their online content to stay private or confidential.
- Use the META instructions cited just above in 1),
- Protect their blogs, or some of their posts, with a password (WordPress offers this possibility, for example),
- Customize their feeds so they deliver only excerpts (see the WordPress guidelines on “excerpts” and RSS management),
- Avoid using the “open” Internet if you do not want your content to be visible or duplicated! Web 2.0 and RSS are not books! Internet content is easily disseminated. By anybody, not only by Internet tools.
3) Recommendations for both 1) and 2)
- Favor diplomatic ways of communication between the bloggers and the retrieval tools before adopting unfriendly attitudes like suing, public bashing or other threats. These are all very sad things we experienced recently,
- Imagine a multilevel formalized process. For example,
1) The blogger sends a message to the service and asks that the content be removed ASAP, and at the latest within 2 to 3 working days (removal can be technically long because the service needs to refresh its database afterwards, which can take up to 48 hours),
2) If no action is taken after the first step, or no response is provided by the service, the blogger sends a second message with a firmer demand and a deadline (sending a copy of the message to a lawyer can be an option),
3) If there is still no response at all, the blogger can then try to take action. But here again, it will work in some cases and probably not in others (big companies have good lawyers and are slow to react to a user's demand, and smaller companies can also be slow to react because their teams are limited). Not to mention that some services are located overseas!
This post is meant to share some reflections on this issue, which will probably come up more and more often over time. We are not proposing a real solution here, only some insight. Our goal is to get bloggers and services to talk to each other and set the rules together.
And you, what is your opinion? What are your ideas?



April 18th, 2008 at 5:07 pm
I’m very interested in the official implementations of NOINDEX and NOCACHE. You can do it by extending the namespace, as Yahoo proposes.
<meta xmlns="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
But is this an industry-wide accepted standard? And do all robots understand it?
On the other hand, there are abuses around… see http://www.emich.be/fr/2008/04/17/recette-pour-se-faire-du-fric-sur-le-dos-d-autres-blogs/
April 18th, 2008 at 10:51 pm
Yes Mike. In any case, bloggers also have the possibility to customize their feeds in order to limit the number of characters displayed. But not every blogger has the skills to carry out such operations on RSS feeds.
I believe part of the solution should be implemented on the search engine’s side. So, we will try to develop something to limit complaints, even though we could simply do nothing. But that is not our philosophy. We would like everybody to be happy with our service.
Thanks again, Mike, for your kind cooperation and suggestions. We appreciate it very much.
June 14th, 2008 at 2:56 am
I’ve come here with possible knowledge of the above situation. A friend has had several “news” agencies place most of his posts in “their” articles. (No credit.) In his fury, many other agencies may have been included in the terse. My question is this: if a Creative Commons notice is very visible, is it one’s obligation to give honor to the poster with a simple notification?
As to RSS feeds, many new bloggers don’t have the technical skills to protect themselves. These issues are the embryos that are growing on the placentas of the entity called Internet. Those such as yourselves can stop the licensings of the media’s Mengeles by assisting those of whom you speak.