Saturday, March 24, 2007

Now on YouTube: Good move or Copyright nightmare?


When talking about Google’s takeover of YouTube, some people may think it is a nifty business move. For the company's legal team, however, it may soon turn into a long and nasty nightmare.

I found an interesting poll on the MSNBC website.

Is acquiring YouTube a good move for Google? (3,023 responses in total.)

Yes, this makes them even more dominant: 55%

No, YouTube is this year's version of Napster: 22%

No, YouTube is just the flavor of the month: 18%

What is this YouTube thing you speak of? 5.5%


Showing videos on the internet is nothing new. YouTube's clever idea was to create a model that makes it easy not just to watch the films, but also to share them.

Want to show a film on YouTube? You don't have to mess about with video standards. Just upload your film and the website does all the heavy lifting. Just make sure you have labelled the clip correctly, so that the rest of the world can find it.

Watching is just as easy.

No worries about having the right video player. Plus you can rate films, recommend them to friends, comment on them - and even integrate them into your own website, without any technical knowledge.

Little wonder that YouTube has been a huge success.

In August 2005 the site had a measly 2.8 million users a month. One year later YouTube’s audience had grown to 72 million people.

This has created its own dynamic. People will post their films on YouTube because that's where the audience is, and the audience will grow ever larger because of the extra content.

It's social networking in overdrive.

Until now most copyright holders had little incentive to sue YouTube. The company was young and rapidly burning through its venture capital. However, now that YouTube is part of Google, with a market capitalization of $129 billion, there is a serious incentive to let the lawyers off the leash.

Google and YouTube obviously don't see themselves as content pirates.

I argue that they act fully within the law, based on general "fair use" standards, and more importantly the safe harbor provisions in section 512 of the US Digital Millennium Copyright Act (DMCA) of 1998.

The act was designed to ensure copyright protection works in the digital age - although its authors clearly did not anticipate today's dynamic and on-demand digital world.

Section 512 helps "service providers" to avoid liability for acts of copyright infringement committed by third parties; it gives them a safe harbor.

It's a complicated piece of legislation, but here is one example of how it is supposed to work:

A service provider (YouTube) stores material (a pirated movie clip) on its system at the direction of a user (YouTube member). Meeting these conditions may help it qualify for the safe harbor provision - at least as long as YouTube makes it easy for copyright holders (a film studio) to complain about the infringement, and quickly removes pirated material that has been brought to its attention.

So far the US courts have failed to rule on what most of the DMCA's statutes actually mean. Already, though, a couple of "512" defences have gone disastrously wrong - for file-sharing services Napster and Grokster. Some copyright experts wonder how it will play in the courts if Google has made advertising dollars on the back of pirated material on YouTube. They also predict that YouTube will continue to get sued.

No doubt Google will have to work hard to steer YouTube into safe waters. Solid content identification, video watermarking, royalty reporting and clearer upload guidelines for YouTube members are a must.

Wednesday, March 7, 2007

Not trapping users' data = GOOD


When users get what they want from you quickly and easily, they’re more likely to come back next time. (Shh. Don’t tell anyone else this vital secret.) Part of that is feeling that they aren’t “trapped”–that they can leave you behind if they want.

I think Google has built a very good targeting engine, and a lot of its business success has come from that. They run the company around the users, so as long as they respect the rights of end users and make sure they don’t do anything against users' interests, they are fine.

The key is that they would never trap user data.

Eric Schmidt, CEO of Google, was asked if users could get all of their search history and export it to Yahoo. He said, "We would like to do that, as long as it is authenticated….If users can switch it keeps us honest." It echoes the “send your users away happy and they’ll come back” philosophy. It also gives guidance to teams at Google.

So I started making a list of the ways that Google lets you access your data:

- Gmail. This one’s easy. Google provides free POP access so that anyone can fetch their email out of Gmail.

- Search. If you sign in with your Google account to search, Google can not only offer personalized search but also let you retrieve your search history. Mihai Parparita did some digging a while ago, for example. The ability to securely access your search history as an RSS feed is documented in Google's help pages now. For example, the URL https://www.google.com/searchhistory/?output=rss works very well if you’re logged into your Google account.

I believe you can add things like “&num=250” so that you don’t have to access 10 items at a time either. This feature is secured by password-protection (you have to be logged in), but it provides a nice way to access your own searches. Oh, and don’t forget to try out your personal search trends. If you’re logged in, the URL is http://www.google.com/psearch/trends and you’ll get all sorts of neat data like your most frequent searches, clicks, and when you tend to search.
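To make the feed idea concrete, here is a minimal Python sketch of building that URL and pulling the item titles out of an RSS response. The base URL is the one quoted above; the "num" parameter is only my guess from the previous paragraph, and the sample feed is made up for illustration, since a real request needs a logged-in session.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Feed URL as quoted in the post; it only answers for a logged-in account.
BASE_URL = "https://www.google.com/searchhistory/"


def history_feed_url(num=None):
    """Build the RSS feed URL. `num` is the guessed page-size parameter."""
    params = {"output": "rss"}
    if num is not None:
        params["num"] = num
    return BASE_URL + "?" + urlencode(params)


def item_titles(rss_text):
    """Return the <title> text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [item.findtext("title") for item in root.iter("item")]


# Made-up sample standing in for what the feed might return (an assumption).
SAMPLE_FEED = """<rss version="2.0"><channel><title>Search History</title>
<item><title>opml format</title></item>
<item><title>ical export</title></item>
</channel></rss>"""

if __name__ == "__main__":
    print(history_feed_url(250))
    print(item_titles(SAMPLE_FEED))
```

The point is simply that a plain RSS feed can be read with nothing more than a standard library, which is what makes this kind of export useful.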

Okay, enough about search. Let’s look at some other products that let you get to your data easily.

- Google Docs and Spreadsheets let you export your stuff in more formats than I know: Word, Rich Text Format, CSV, HTML, XLS (Excel), and PDF. Even one I didn’t know: .ods? Ah, OpenOffice. Nice. :)

- Google Calendar. Also easy. From its launch, Google calendar has allowed iCal (.ics) and RSS export of calendar data.

- Google Talk uses the open XMPP protocol. The VoIP part of Google Talk is done with Jingle, another open protocol that Google helped with. I like that Google's IM chat is open to other clients, so you can talk from iChat and GAIM to Trillian Pro and Blackberries.

- Google Reader easily exports your list of feeds in OPML format, and can import OPML files as well.

- Blogger. Blogger can export data via FTP or SFTP and back up your blog.

- Google AdWords. I don’t use AdWords myself, but Google provides a free application called the AdWords Editor, and its features include a snapshot export feature: “Save a delimited file with your AdWords account information and show it to a colleague or keep it for reference.” So I’m assuming it’s not too hard to suck down your AdWords info. Yup, a couple minutes of searching found references to importing your Google ad campaigns into Microsoft and Yahoo.

- Google Groups. I was dreading checking on this one. Back in August, someone wrote to me and said “I run a Google Group with 7,500+ subscribers and I need to download the subscriber list, but I don’t see an option for that.” It turns out that we didn’t offer that as a feature back in August. We were able to help the fellow, but it didn’t sound like an often-requested feature, so I didn’t think the Groups team had gotten a chance to do this. But I checked and it looks like the Groups team got a chance to add this. Yay! For a group I owned, I clicked “Manage” and then “Browse membership list.” At the bottom right will be a button “Export member list” and clicking that will download a comma-separated value (CSV) file.

- Let’s see, where else can you store data at Google? Ah, a Custom Search Engine. There’s even a bookmarklet to let you add sites to your custom search engine as you surf the web. Can you get your entire list of sites exported from your Custom Search Engine? Yup. Go to your search engine’s control panel and click on the “Advanced” tab. You’ll get options to download your sites in XML or tab-separated value (TSV) file format.

- Lots of products like Google Analytics and the Google Webmaster Console also give options to export data in various formats.
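Because the exports in the list above use open formats, reading them back takes very little code. Here is a rough Python sketch that lists the feed URLs in an OPML file (as Google Reader exports) and parses a CSV member list (as Google Groups exports). Both samples are invented for illustration, and the CSV column names in particular are assumptions, not what Google actually emits.

```python
import csv
import io
import xml.etree.ElementTree as ET


def feed_urls(opml_text):
    """Collect the xmlUrl attribute of every <outline> element that has one."""
    root = ET.fromstring(opml_text)
    return [o.get("xmlUrl") for o in root.iter("outline") if o.get("xmlUrl")]


def csv_rows(csv_text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Invented OPML sample: one folder containing two subscriptions.
SAMPLE_OPML = """<opml version="1.0"><head><title>Subscriptions</title></head>
<body>
  <outline text="Blogs">
    <outline text="Example A" type="rss" xmlUrl="http://example.com/a.xml"/>
    <outline text="Example B" type="rss" xmlUrl="http://example.com/b.xml"/>
  </outline>
</body></opml>"""

# Invented member list; the column names are hypothetical.
SAMPLE_CSV = "email,nickname\nalice@example.com,Alice\nbob@example.com,Bob\n"

if __name__ == "__main__":
    print(feed_urls(SAMPLE_OPML))
    print([row["email"] for row in csv_rows(SAMPLE_CSV)])
```

Folder outlines carry no xmlUrl, so they are skipped and only real subscriptions are listed.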

Okay, so looking down this list, it looks like Google does pretty well in offering open access to your data, at least for all the important services that I checked. If you know of some way that Google doesn’t let you download your data, please feel free to mention it.

Sunday, February 25, 2007

Some thoughts

Recently, I saw Elinor Mills writing about an interesting allegation against Google. I’ll include the whole content of the allegation: [Just to clarify, this is an allegation that Elinor is passing on from a newsletter, not a claim that Elinor is making herself directly.]

In the past, when you launched a website, or Google wasn’t picking up your stuff, you could call the friendly people over there and they’d look at your website to see if you were legit, look at their search results, and adjust their code appropriately. It used to be this all occurred in the same day. Then it was 24 hours. So, imagine our dismay when www.wesrch.com wasn’t even being picked up two weeks after we launched. We had called Google two days into the launch and they apologized, saying their search engines were backlogged with so many sites to monitor. We called after a week and then called again and again, with no better answer. We even tried posting ads with Google and they couldn’t find us. Clearly, we had tried their patience, as in the end they threatened to BLACKLIST our websites so no one would ever find us again. Now is that power or what? Funny thing is, Yahoo found us faster and more reliably. So, Google is no longer my home page. More importantly, they are showing all the signs of a monopolist trying to forcibly extract revenues for nothing. Whenever this happens, it’s a sign that revenue growth has peaked and they are trying to force it in order to maintain high stock valuations. So watch out if you are an investor.

When Elinor asked for a comment about this, several of Google's employees read the original complaint and responded that they were perplexed. Google doesn’t provide phone support for webmasters; as Vanessa Fox recently noted, over 1 million webmasters have signed up for its webmaster console alone, so offering phone support for every site owner in the world wouldn’t really scale that well. The complaint mentions buying ads later in the paragraph; Google staff wondered, “Maybe they were talking to phone support for AdWords?” But they can’t imagine anyone at Google, on the ads side or anywhere else, saying that its search engines were backlogged with too many sites to monitor. The Google index is designed to scale to billions of webpages, and it does that job pretty well. It’s even harder for them to imagine anyone at Google saying on the phone that they would “BLACKLIST our websites so no one would ever find us again,” because, again, Google doesn’t provide webmaster support over the phone, and the staff believe AdWords phone support would know better than to claim the index was backlogged or to threaten to remove anyone’s site from it. Maybe a call to AdWords support reached such a fever pitch that a representative declined to run an ad?

At any rate, the staff felt sorry for any negative interactions that wesrch.com had with Google. The current description of the issue doesn’t give enough concrete details to check out, but if anyone from that domain wanted to clarify or to provide emails or dates/times/names of phone calls (did they call AdWords? Randomly try to hop into the Google phone tree? Talk to a receptionist?), Google's staff would be happy to try to look into it more.

In the absence of more details about their interaction, I tried to dig more into the crawling of wesrch.com. I didn’t see any negative issues (no spam penalties or anything like that) for the domain. I saw attempts to crawl the site as far back as October 2006, but that earliest attempt got an authentication crawl error (that would have been a 401 or a 407 HTTP status code). I believe that this allegation went out Feb. 2nd, and I believe we had at least one page from that site at that point. I did notice that visiting the root page of the domain gives a 302 (temporary) redirect to the HTTPS version of the domain. That’s kinda unusual, but we should still be able to crawl that.
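For readers who don't have the status codes memorized, here is a small Python sketch that sorts the codes mentioned above into the categories I used. The category labels and the classify function are my own illustration, not anything a crawler actually runs, and only the codes named in this post are handled specially.

```python
from http import HTTPStatus

# The two codes the post groups together as "authentication crawl errors".
AUTH_ERRORS = {
    HTTPStatus.UNAUTHORIZED,                   # 401
    HTTPStatus.PROXY_AUTHENTICATION_REQUIRED,  # 407
}


def classify(code):
    """Map an HTTP status code to a coarse, illustrative crawl category."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    if status in AUTH_ERRORS:
        return "authentication error"
    if status == HTTPStatus.FOUND:  # 302, the temporary redirect noted above
        return "temporary redirect"
    if 200 <= code < 300:
        return "success"
    return "other"


if __name__ == "__main__":
    for code in (401, 407, 302, 200):
        print(code, HTTPStatus(code).phrase, "->", classify(code))
```

A page behind a 401 or 407 can't be fetched without credentials, while a 302 just points the crawler at another URL, which is why the redirect alone shouldn't stop a site from being crawled.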

The other thing to look at is current coverage. Here’s what I saw:

Search Engine    Number of pages
Google           over 450 pages
Yahoo            1 page
Live             about 176 pages
Ask              0 pages

(Note that if you just do [site:wesrch.com] on MSN/Live, you might get results estimates as high as 500+ results, but the way to verify results estimates is to go to the final page of results, and MSN/Live stops after 176 results.)
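The "go to the final page" check can be sketched in Python like this. The fetch_page callable and the fake engine are hypothetical stand-ins, since I'm not assuming any real search API here; the idea is just to keep paging until a page comes back short and count what was actually returned.

```python
def count_by_paging(fetch_page, page_size=10):
    """Walk result pages until one comes back short; return the total seen."""
    total = 0
    page = 0
    while True:
        results = fetch_page(page, page_size)
        total += len(results)
        if len(results) < page_size:  # short or empty page: no more results
            return total
        page += 1


def make_fake_engine(actual_results):
    """Hypothetical stand-in for an engine that really holds N results."""
    hits = list(range(actual_results))

    def fetch_page(page, page_size):
        start = page * page_size
        return hits[start:start + page_size]

    return fetch_page


if __name__ == "__main__":
    # The engine may *estimate* 500+, but paging shows what it can return.
    print(count_by_paging(make_fake_engine(176)))
```

The estimate on the first page is only a guess by the engine; the paged count is the number of results it will really show you.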

It looks like Google crawls wesrch.com at least as deeply as any other major search engine. I’m still confounded as to who the folks at wesrch.com could have talked to at Google, but I’ll leave open the offer to dig into it more if they want to provide more details. And I’ll wish them well for their new domain in the future.