Schema-org, microformats and more science please

I was just about to post about the study by Hixie regarding itemscope and itemtype. I was quite saddened to read the results from said study and can't help wondering if this really was statistically significant enough to let it impact the specification like this.

Like the poster, I had a WTF moment reading about itemscope, and found the Hixie study while looking for what could possibly be the reason for choosing this syntax.

I posted some notes about it here: http://blog.whatwg.org/usability-testing-html5

Unfortunately the raw data can't be published for privacy reasons.

I was really surprised by the itemscope/itemtype thing helping people, but it was a really stark result if I recall correctly. Originally I'd designed it with just one attribute "item", whose value was optional but if present was the type. Confusion abounded in the usability lab when we tested that variant. We had a variant with the attributes split more or less like it is now, and the participants in the study were far more comfortable with that. One of the people who was tested on my original design saw the split variant near the end of their session, and it was like they had an epiphany.

It was quite an educational experience for me as a language designer. Things that I thought were obvious (URLs are too long and unwieldy to be used everywhere, terse markup is better than verbose redundant markup) were repeatedly shown to be false. It really changed how I design languages.

HTH.

Yeah and down with <section> too!

damnit.

IIRC the reason for having both itemscope="" and itemtype="" came from the Microdata usability study done at Google. The participants in the study made fewer markup errors in the two-attribute case.

James: fixed ;)

Edward: interesting! Do you happen to know if that study is published anywhere? Mostly just curious.

Gavin: Well, yes... except then you'd be in violation of DRY.

No, no it doesn't. Okay, the 2nd part above should have been in a script tag with type text/turtle.

http://gavin.carothers.name/turtle-in-html/london-vcard.html

That's using the data embedding features of script in HTML 5. See: http://dvcs.w3.org/hg/rdf/raw-file/default/rdf-turtle/index.html#in-html
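As a rough illustration of how a consumer might pick up such an embedded block, here is a minimal sketch using only the Python standard library. The `ex:` vocabulary, the sample page and the `extract_turtle` helper are made up for the example; a real consumer would hand the extracted text to a Turtle parser.

```python
from html.parser import HTMLParser


class TurtleBlocks(HTMLParser):
    """Collect the text of every <script type="text/turtle"> element."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None  # list of text chunks while inside a Turtle script

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "text/turtle":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer).strip())
            self._buffer = None


def extract_turtle(markup):
    """Return the Turtle data blocks found in an HTML string."""
    parser = TurtleBlocks()
    parser.feed(markup)
    parser.close()
    return parser.blocks


# Hypothetical page: visible text plus a Turtle data block.
# The ex: prefix is an invented vocabulary, not the vCard one.
page = """
<p>Department for Transport</p>
<script type="text/turtle">
@prefix ex: <http://example.org/vocab#> .
[] a ex:Card ; ex:fn "Department for Transport" .
</script>
"""

blocks = extract_turtle(page)
print(blocks[0])
```

The browser ignores the script content because it does not recognise the type, so the data travels with the page without being rendered.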

I do wonder if something like:

Department for Transport

Great Minster House 76 Marsham Street London SW1P 4DR

Telephone: 0300 330 3000 Website: www.dft.gov.uk Email: firstname.surname@dft.gsi.gov.uk

@prefix v: .
[] a v:VCard;
   v:fn "Department for Transport" ;
   v:adr [ v:extended-address "Great Minster House";
           v:street-address "76 Marsham Street";
           v:locality "London" ;
           v:postal-code "SW1P 4DR" ; ] ;
   v:tel [ v:Work; rdf:value "0300 330 3000" ] ;
   v:url ;
   v:email .

Wouldn't be easier?

Now to find out if this comments field does escaping ;)

Ian:

I agree with Karl about the methodological concerns here. 7 is a pretty small sample. More to the point, AFAICT this is a test of "can n00bs learn a thing this way"; "what works best over long-term use" seems to be something this survey did not study. A study of that could be constructed using new forms of elements the participants already know, redesigned to be "clearer" in this way (e.g., an output type="video" vs. the video tag).

Perhaps we can eventually say that what's good for new users is good for the experienced as well, but this research doesn't seem to explore that, even in the small population. But it's good to know what's good for new users too.

Regards

You may be interested in the Data-Driven Standards Community Group at the W3C:

http://www.w3.org/community/data-driven-standards/2011/11/07/launch/

Ian:

Which parts of the raw data prevent it being published?

Excellent to hear that decisions are evidence-based, but without publishing the data it can't be reviewed & debated.

I know it's not your intention, but the whole "Evidence proves I'm right!" "Can I see this evidence?" "No, it's secret" thing is the folly of quacks.

I'm not sure that you can have a single attribute doing double-duty as both a boolean and a value, if you see what I mean.

Consider if you got rid of itemscope and just had itemtype instead - what would you do in an XML serialisation? You'd need to do itemtype="itemtype", then the specification would need to use that as a reserved value, which would start to get messy.
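To make the serialisation point concrete, here is a small sketch using only the Python standard library. It shows that an HTML parser accepts a valueless attribute, while an XML parser rejects the same markup until the attribute is given a value. The `example.org` type URL and the helper names are placeholders for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class AttrCollector(HTMLParser):
    """Record the attributes of every start tag the HTML parser sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, dict(attrs)))


def html_attrs(markup):
    """Parse markup as HTML and return (tag, attributes) pairs."""
    collector = AttrCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen


def xml_parses(markup):
    """Return True if markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Placeholder type URL, used only for this example.
html_form = '<div itemscope itemtype="http://example.org/Thing"></div>'
xml_form = '<div itemscope="itemscope" itemtype="http://example.org/Thing"></div>'

print(html_attrs(html_form))  # the boolean attribute has no value in HTML
print(xml_parses(html_form))  # a bare attribute is not well-formed XML
print(xml_parses(xml_form))   # the attribute needs an explicit value in XML
```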

That was one massive WTF moment I had (and asked about, never getting an answer).

The itemscope is utterly pointless. The scope of the property is /always/ the scope of the tag to which it is applied, which makes this extremely verbose and rather confusing.

I only stumbled across this today. I was not aware that Microformats were still going. I thought they had died through lack of traction. Don't get me wrong, I think the data-driven semantic web is the way forward for many things, including open government, data exchange, commerce and the Internet of Things.

But I have an open question, which is possibly rather naive. Why be so concerned about how a Microformat is constructed? If the data is to be read by computer then it needs to be in a sensible format for machines, no matter how complex that may be. Most large-scale web sites are created programmatically, so as long as the application code is constructed correctly (once), any number of Microformats can be created without error.

I appreciate the lack of elegance in the code, but surely such problems are solved, or at least hidden, once the web application is written.

Mark (out of work right now, with clearly nothing better to do than learn new stuff!)
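The "write it once in application code" argument can be sketched roughly as follows. This is an illustrative helper, not any particular library's API: the `microdata_item` function is hypothetical, it handles only flat string properties, and it uses the schema.org Organization type purely as an example.

```python
from html import escape


def microdata_item(item_type, properties):
    """Render a flat dict of string properties as one microdata item."""
    lines = [f'<div itemscope itemtype="{escape(item_type, quote=True)}">']
    for name, value in properties.items():
        # Escape both the property name and its text so that user data
        # cannot break the generated markup.
        lines.append(
            f'  <span itemprop="{escape(name, quote=True)}">{escape(value)}</span>'
        )
    lines.append("</div>")
    return "\n".join(lines)


markup = microdata_item(
    "http://schema.org/Organization",
    {"name": "Department for Transport", "telephone": "0300 330 3000"},
)
print(markup)
```

Once a helper like this is correct, the verbosity of the attributes costs the template author nothing, although it still costs bytes on the wire.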

In my work I build websites that are used mostly in environments with poor, often expensive and flaky internet connections. My vacations usually also take me to such places, which makes me care A LOT about page sizes.

I use microformats when applicable, microdata when needed and so far never schema.org, but I also ditch them if I find my page ballooning too much. It can be a very, very mild version of Sophie's choice.

Not every place is wired like most (but not all) of Europe or Silicon Valley, and I wish we would take this more into account when creating new standards. I expect the situation will improve eventually, but then again, standards will change too. HTML5 is, after all, a living document.

Sorry if I am ranting too much.

[...] But, perhaps ironically, it was Ian Hickson who finally swayed me. When researching microdata and RDFa Lite, I noticed that a table comparing RDFa Lite and microdata said that microdata’s itemscope attribute is “not needed” in RDFa. More pertinently, it’s not really needed in HTML5 microdata, either. As Matt Wilcox writes [...]