The natural form of a book is as a nested hierarchy of parts. Sometimes this hierarchy is shallow, as for a novel with a prologue and ten chapters. Sometimes it is a bit deeper — perhaps the book has several volumes, each with front-matter containing a foreword, introduction and acknowledgements, a number of Parts each containing chapters, each containing sub-sections.
Zhook allows book designers to accurately model the hierarchy of the book within a single document, using semantic HTML structural elements. This is powerful for a few reasons.
For display and optimisation reasons, some Zhook reading systems will prefer to split a book into several components and display each one discretely.
Here’s the algorithm such reading systems are expected to follow. It’s pretty simple.
1. The initial list of components contains just the <body> element.
2. Walking the DOM in node order, depth first: each immediate child of the current component that is an <article> element should be extracted from the document and placed at the end of the list (thereby becoming the current component), unless it has subsequent siblings that are not <article> elements.
3. Remove any components that contain only whitespace.
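The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the `Node` class is a hypothetical stand-in for a DOM element, whitespace text nodes between elements are ignored, and the names `componentize`, `has_content` and `leaf` are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False: nodes compare by identity, like DOM nodes
class Node:
    """Hypothetical stand-in for a DOM element; not part of any Zhook API."""
    tag: str
    label: str = ""
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def has_content(node: Node) -> bool:
    """True if the node or any remaining descendant has non-whitespace text."""
    return bool(node.text.strip()) or any(has_content(c) for c in node.children)


def componentize(body: Node) -> list[Node]:
    components = [body]  # step 1: the list starts with just the body

    def walk(component: Node) -> None:
        siblings = list(component.children)  # snapshot; extraction mutates the original
        for i, child in enumerate(siblings):
            if child.tag != "article":
                continue
            if any(s.tag != "article" for s in siblings[i + 1:]):
                continue  # a later non-article sibling blocks extraction
            component.children.remove(child)
            components.append(child)
            walk(child)  # step 2: the extracted article is now the current component

    walk(body)
    # step 3: drop components left with nothing but whitespace
    return [c for c in components if has_content(c)]


def leaf(tag: str, label: str) -> Node:
    """Helper for the example: a childless element whose text is its label."""
    return Node(tag, label, text=label)


# A small hypothetical document to exercise the sketch.
doc = Node("body", "Body A", children=[
    Node("article", "Article B", children=[
        leaf("h1", "Heading 1"),
        Node("article", "Article C", children=[leaf("h1", "Heading 2")]),
        Node("article", "Article D"),
        leaf("p", "Paragraph P1"),
    ]),
    Node("article", "Article E", children=[
        leaf("h1", "Heading 3"),
        Node("article", "Article F"),
        leaf("h1", "Heading 4"),
    ]),
    Node("article", "Article G", children=[leaf("p", "Paragraph P2")]),
])

labels = [c.label for c in componentize(doc)]
print(labels)  # ['Article B', 'Article E', 'Article G']
```

In this run the body ends up empty once its three articles are extracted, so step 3 drops it. Articles C, D and F stay inside their parents because each has a later sibling that is not an article.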
Why <article>?
According to the HTML5 specification, "the article element represents a self-contained composition in a document ... that is intended to be independently distributable or reusable."
That's pretty much exactly the notion of a component in a book.
Here’s a sketch:
body A
- article B
- article C
- article D
- section 1
- section 2
- article E
- heading 3
- article F
This produces the following components:
The following document will not componentise at all, because the articles are inside section elements.
body A
- section B
- heading 1
- article C
- heading 2
- article D
- paragraph P1
- section E
- heading 3
- article F
- heading 4
- article G
- paragraph P2
- section S1
- paragraph P3
So let's change sections B and E to articles.
body A
- article B
- heading 1
- article C
- heading 2
- article D
- paragraph P1
- article E
- heading 3
- article F
- heading 4
- article G
- paragraph P2
- section S1
- paragraph P3
Now the document has three components: Article B, Article E and Article G. Note that Articles C and D are not components because of the subsequent Paragraph P1 in Article B. Similarly, Article F is not a component because of that pesky Heading 4.
Like the one before it, this is a perfectly valid structure. But if we wanted to allow Articles C, D and F to be components, we could modify the structure again, like this:
body A
- article B
- heading 1
- article C
- heading 2
- article D
- article
- paragraph P1
- article E
- heading 3
- article F
- article
- heading 4
- article G
- paragraph P2
- section S1
- paragraph P3
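As you can see, now the tree has two important characteristics:
- Every article descends only from the body or from other articles.
- No article is followed by a sibling that is not an article.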
The upshot of this is that now every article element in the document can be extracted by the reading system as a component.
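Whether a given tree has this property can be checked mechanically. The sketch below is a hypothetical helper, not part of Zhook: it assumes a minimal `Node` model of the element tree and reports whether every article hangs only from the body or from other extractable articles, with no non-article sibling after it.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a DOM element; not part of any Zhook API."""
    tag: str
    children: list["Node"] = field(default_factory=list)


def fully_componentizable(node: Node, extractable: bool = True) -> bool:
    """Return True if every article under `node` could become a component.

    `extractable` is True when `node` is the body, or an article that
    would itself be extracted as a component.
    """
    kids = node.children
    for i, child in enumerate(kids):
        child_extractable = False
        if child.tag == "article":
            trailing_ok = all(s.tag == "article" for s in kids[i + 1:])
            if not (extractable and trailing_ok):
                return False  # this article would be left inside its parent
            child_extractable = True
        if not fully_componentizable(child, child_extractable):
            return False
    return True


# An article followed by a paragraph cannot be extracted.
blocked = Node("body", [Node("article", [Node("article"), Node("p")])])
# Wrapping the trailing paragraph in its own article fixes that.
clean = Node("body", [Node("article", [Node("article"),
                                       Node("article", [Node("p")])])])
# An article inside a section is never an immediate child of a component.
sectioned = Node("body", [Node("section", [Node("article")])])

print(fully_componentizable(blocked))    # False
print(fully_componentizable(clean))      # True
print(fully_componentizable(sectioned))  # False
```

A check like this could run at authoring time, so the designer learns about articles that will stay embedded before a reading system splits the book.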
None of this is mandatory. Making a conformant Zhook Index file does NOT mean that you "have to use articles", "only have articles that descend from the body or other articles", or "not put other elements after articles".
A conformant Zhook Index file is any valid HTML5 document. But if you want to make life easier for componentizing reading systems (which will be most of them), it's worth considering these guidelines.
Obviously, componentizing significantly modifies the DOM. Style rules that rely on an article within an article, or an article immediately after an article, may not be applied if an article is extracted into a component. That is:
article article { ... }
article + article { ... }
These sorts of style rules won't work if a reading system is componentizing, and doing it correctly.