Friday, September 12, 2008

ACAP Publisher Content Specifications Session at ONA08

The fundamental position is to guard copyright for content publishers. The problem on the network is "not copyright" but the absence of tools to make copyright work. This is a move by the publishing sector to look out for itself.

The "answer to the machine, lies in the machine" so they need to have permissions, licensing, syndication to be clearer and communicated throughout electronic, networked communication.

They claim that few publishers are indifferent to terms & conditions; the problem is that the written policies are not machine friendly. So they are looking for machine-readable, machine-understandable policies. The group is pushing for widely adopted terms, i.e. industry standards that will scale and be flexible enough to work in the future.

Automated Content Access Protocol (ACAP) is aimed first at the business-to-business environment, so that access and policies can be handled by machines.

Question: What is wrong with the tags now being used by machines?
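For context (my note, not the presenter's): the "tags now being used" are the standard per-page robots controls that sites already expose to crawlers and that the major engines honor today. For example:

    <!-- robots meta tag placed in a page's HTML head -->
    <meta name="robots" content="noindex, nofollow, noarchive, nosnippet">

The publishers' complaint is that these are blunt yes/no switches, not expressions of licensing terms.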

Their v1.0 is "unambiguous," not anti-search and not anti-search-engine -- but who else is copying content using crawlers? Spam bloggers, for example, who monetize your content but bring you no value.

This is based on the robots protocol. There are extensions made by the major search engines; some are compatible across agents, some are unique to a particular engine. Publishers say robots.txt does not give them enough control over content. Because the existing robots protocol is not sufficient, ACAP developed a pilot project that is standardized and machine-readable, can handle special cases, can communicate with search engine crawlers, and can protect content while still giving search engines access to that protected content. They also want to deal with the "take down" problem: libelous material that a publisher removes can remain cached, and thus available online.
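As background (again my note): plain REP is just per-agent allow/deny keyed on URL paths, and the extensions referred to are vendor additions layered on top, with uneven support. A sketch of what exists today, for a hypothetical example.com site:

    # Plain Robots Exclusion Protocol: per-agent rules keyed on URL paths
    User-agent: *
    Disallow: /private/

    # Vendor extensions layered on top of REP (support varies by engine)
    Crawl-delay: 10
    Sitemap: http://example.com/sitemap.xml

The caching complaint maps to the "noarchive" tag shown earlier, which only governs future crawls; it does not immediately pull copies already cached, which is part of the gap ACAP says it wants to close.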

Google, Live Search, and Yahoo are the "major search engines" they work with in this endeavor. None of the big three has adopted the ACAP standards, though they are still talking about it.

The Robots Exclusion Protocol (REP) was built at the request of search engines. ACAP elaborates on these exclusions. They also had to create a dictionary and a crawler-identification scheme to tell the bad robots from the good ones. The malicious ones act more and more like humans, and it is difficult to tell whether a crawler is who it says it is (IP switching).
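A minimal sketch (mine, not from the session) of the kind of crawler verification this implies: because user-agent strings and IP addresses can be spoofed, the usual advice from the major engines is to reverse-resolve the caller's IP, check the hostname against the crawler's published domain, then forward-resolve it again and confirm it matches.

    import socket

    def verify_crawler(ip, allowed_suffixes=(".googlebot.com", ".google.com")):
        """Reverse-DNS / forward-DNS check that an IP claiming to be a known
        crawler really belongs to it. The default suffixes are the ones Google
        documents for Googlebot; other engines publish their own domains."""
        try:
            hostname, _, _ = socket.gethostbyaddr(ip)      # reverse lookup
        except socket.herror:
            return False
        if not hostname.endswith(allowed_suffixes):
            return False
        try:
            forward_ip = socket.gethostbyname(hostname)    # forward lookup
        except socket.gaierror:
            return False
        # A stricter check would accept any address the hostname resolves to.
        return forward_ip == ip

This only answers "is this crawler who it claims to be"; it says nothing about the harder problem the speaker mentions, bots that imitate human visitors.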

Jeff Jarvis asks how you tell a search crawler from a news crawler, and raises the issue that this is a layer of control publishers want to put on the Internet that does not exist on the web today.

They want a "usage purpose" to be in the robot exclusion (e.g. image, news, etc.), limit permission to specified uses. Details on ACAP.

"No archive, just link back to site" and no crawl news, just sports are examples of what the standards were doing. Now one ids crawler and enables or restricts premium content.

My sense is that some of this is well-meaning and some of it might be useful; however, some of it is very "anti-Internet" in the sense that it would create tiered content and impede searches. The presenter stepped back from questions of whether this is right or wrong, but it is certainly not without controversy.

They are trying to self-regulate rather than work through the legal system. The specs were completed just a year ago, and now they want to move into implementation, which means the major search engines must listen and adopt them. They also need to spec out the elements of any "page," e.g. page, image, news, headline, etc.

The group now wants sites to adopt ACAP even though the instructions may not yet be honored -- there is no practical impact and it is free. Their effort now is to unite publishers into one large, cohesive group. Most implementing sites are in Europe, but 50 or so are in the USA. Most sites that use ACAP are "high-value" content sites.

They critique REP because it is fault tolerant and will silently ignore "bad" or unrecognized specifications.
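A quick way to see the fault-tolerance point (my illustration, not the presenter's): a typical robots.txt parser simply skips any field it does not recognize, so ACAP lines dropped into robots.txt are invisible to non-ACAP crawlers rather than being treated as errors.

    # Sketch of how a typical non-ACAP robots.txt parser behaves: unknown
    # fields such as "ACAP-disallow-crawl" are skipped, never rejected.
    KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}

    def parse_robots(text):
        rules = []
        for raw in text.splitlines():
            line = raw.split("#", 1)[0].strip()     # drop comments, whitespace
            if not line or ":" not in line:
                continue
            field, value = (part.strip() for part in line.split(":", 1))
            if field.lower() not in KNOWN_FIELDS:
                continue                             # fault tolerance in action
            rules.append((field.lower(), value))
        return rules

    print(parse_robots("User-agent: *\nACAP-disallow-crawl: /x/\nDisallow: /y/"))
    # [('user-agent', '*'), ('disallow', '/y/')] -- the ACAP line vanished

That is convenient for interoperability, but from the publishers' point of view it means a crawler can never be forced to acknowledge terms it was not built to understand.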

Question: Doesn't this change the neutral quality of the Net and put content on the net under the control of publishers?

mark.bide@rightscom.com

Jeff Jarvis challenges this effort as a threat to journalism: it attempts to undercut the link economy. Bad links should instead be dealt with for what they are. He says the AP vs. bloggers conflict is an example of simply trying to control content.

I think this is an old-fashioned way to try to handle online content, and wrong-headed because it runs counter to basic principles of the web.
