According to the BBC, Rupert Murdoch will try to block Google from using 'news' content from his companies. As you can read here, he can already do this easily by asking Google to remove his websites from their news index via the Google News opt-out form. Even better, he can create a robots.txt file on each of his web servers to prevent indexing of his sites by any other web crawler that respects the 'Robots Exclusion Protocol'.
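To make the robots.txt idea a bit more concrete, here is a minimal sketch of how a well-behaved crawler could interpret such a file, using Python's standard `urllib.robotparser` module. The two robots.txt bodies, the `is_allowed` helper and the `news.example.com` URL are my own made-up illustrations, not anything Newscorp or Google actually uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that shuts out every crawler honouring the protocol.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that shuts out only Google's crawler.
BLOCK_GOOGLE_ONLY = """\
User-agent: Googlebot
Disallow: /
"""

# Made-up URL, used purely for illustration.
STORY_URL = "http://news.example.com/story.html"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(BLOCK_ALL, "Googlebot", STORY_URL))
    print(is_allowed(BLOCK_GOOGLE_ONLY, "Googlebot", STORY_URL))
    print(is_allowed(BLOCK_GOOGLE_ONLY, "OtherBot", STORY_URL))
```

Note that this only works against crawlers that choose to check the file; robots.txt is a request, not an access control.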
Another alternative is to put a robots meta tag like this in the head of each page he doesn’t want indexed:
<html>
<head>
<title>Faux News</title>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
</head>
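For the curious, this is roughly how a crawler might pick up that tag. It is only a sketch using Python's standard `html.parser` module; the `RobotsMetaParser` class and the `robots_directives` helper are names I made up, and real crawlers are certainly more elaborate.

```python
from html.parser import HTMLParser

# The example page head from above, used as test input.
PAGE = """\
<html>
<head>
<title>Faux News</title>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
</head>
"""


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots directives declared in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    found = robots_directives(PAGE)
    print("index this page:", "noindex" not in found)
    print("follow its links:", "nofollow" not in found)
```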
Interestingly, a number of newspapers think that the extra traffic sent their way by Google is a problem, as you can read in the blog post I already linked to above. I think they have good cause for concern, but I don’t really know what they could do about it, short of producing better content than the other indexed websites. I think that once these traditional media go online it will be really difficult for them to keep their customers, and that is probably why they want to go for a closed-content model. The catch is that for this to work they have to prevent all of the news from being available online elsewhere, otherwise people will simply read the competitors’ content instead. Either that, or they again have to produce something that people really want to read.
Google has no doubt already pointed this out to the newspapers many times, as you can see, for example, here.
Filed under: News | Tagged: Google, google news, Murdoch, Newscorp