Advanced Silva-Extra Metadata Usage and using Silva's html parser
Disclaimer:
This article covers parts of Silva's internals that aren't part of the public API, specifically the section on Silva's html parser. It was written for Silva 1.4 and should work with 1.2 and later (possibly 1.1), but the APIs used may change at some point in the future without any public notice.
Note: up to Silva 1.4.1, the SilvaMetadata product, version 0.10, contains a bug that prevents the catalog from updating when metadata is updated. This means the publicationtime in the catalog is out of date. If you have version 0.10, upgrade to the latest release, pull the CVS trunk, or apply the fix described here: https://infrae.com/issue/silva/issue1415
Silva's silva-extra metadata set contains some text areas for 'html metadata' input, like subjects, keywords, and description. The normal use for these is to modify your layout template to include these in the <head> of each document (if they exist, of course). That's great, but largely (in our environment at least) these aren't used. Sometimes custom applications (or your silva-based website itself) can make very interesting use of these metadata fields.
Here's one interesting use: a code source that displays the description metadata field for the 5 most recently updated documents. We also want to allow basic html markup so authors can add some creative flair to their document descriptions.
This article touches on three main topics:
- Querying the catalog to find interesting silva objects (service_catalog)
- Pulling the description from each object (service_metadata)
- Using Silva's html parser to render the description (service_editorsupport)
Getting started
Since this is going to be placed in a Silva Code Source, so that it is available to any Silva document, create a Silva Code Source in the ZMI. Call it cs_recently_updated. Give it an interesting title and description if you want. Create a PythonScript with id render, and make sure render is listed as the Code Source's Script Id.
If you want a customizable limit on the number of silva objects, create a parameter for the code source called 'limit' as a text field (or whatever you want, really). In render's parameter list, put limit=X (where X is the default limit).
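A value entered in a text field will most likely reach the script as a string, so it may be worth normalizing it before it is used. The following is a minimal sketch, not part of Silva: the helper name normalize_limit, the default of 5, and the cap of 50 are all assumptions made for illustration.

```python
def normalize_limit(limit, default=5, maximum=50):
    """Return limit as a positive int, falling back to default.

    Hypothetical helper: the default and maximum values are
    illustrative assumptions, not Silva settings.
    """
    try:
        value = int(limit)
    except (TypeError, ValueError):
        # Empty string, None, or non-numeric text
        return default
    if value < 1:
        return default
    # Cap the value so a typo can't request a huge result set
    return min(value, maximum)
```

In render, this could be called once at the top, e.g. limit = normalize_limit(limit), before the value is used anywhere else.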
Querying Silva's catalog
The flow of the python script goes like this: retrieve the limit most recent silva objects; for each silva object, get its description and render it as html; then pass this list of silva objects and rendered descriptions to a page template to render it as html.
Note that this can be done in reverse: your Script Id can point to a page template, which in turn calls a python script to retrieve the objects and descriptions.
Here's the code snippet to query the catalog for all Silva Document Versions updated in the last 10 days, returning at most limit documents:
from DateTime import DateTime

now = DateTime()
# Restrict the search to the publication containing the current document
pub_path = '/'.join(context.REQUEST.model.get_publication().getPhysicalPath())
query = {'path': pub_path,
         'version_status': 'public',
         'meta_type': ['Silva Document Version'],
         # now-10 is ten days ago; 'min' matches that date or later
         'silva-extrapublicationtime': {'query': [now-10], 'range': 'min'},
         'sort_on': 'silva-extrapublicationtime',
         'sort_order': 'descending',
         'sort_limit': limit
        }
results = context.service_catalog(query)
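The query above can also be wrapped in a small function so the path, cutoff date, and limit are easy to vary. This is only a sketch: build_recent_query and trim_results are hypothetical helper names, and the values in the demonstration are placeholders rather than real Zope objects. The slicing step is there because, as far as I know, ZCatalog treats sort_limit as an optimization hint, so slicing the results is a cheap safeguard.

```python
def build_recent_query(path, since, limit):
    """Build the catalog query dict (hypothetical helper).

    path  -- '/'-joined physical path of the publication
    since -- cutoff date (DateTime() - 10 in the code source)
    limit -- maximum number of results wanted, as an int
    """
    return {
        'path': path,
        'version_status': 'public',
        'meta_type': ['Silva Document Version'],
        'silva-extrapublicationtime': {'query': [since], 'range': 'min'},
        'sort_on': 'silva-extrapublicationtime',
        'sort_order': 'descending',
        'sort_limit': limit,
    }


def trim_results(results, limit):
    """Keep at most limit results (hypothetical helper)."""
    return results[:limit]


# Stand-alone demonstration with placeholder values
query = build_recent_query('/silva/pub', 'cutoff-placeholder', 5)
trimmed = trim_results(['a', 'b', 'c', 'd', 'e', 'f', 'g'], query['sort_limit'])
```

In the code source itself the call would look something like results = trim_results(context.service_catalog(query), limit).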
Rendering the description
So
far so good, right? Now for the cool part. We need to
iterate through the results, get the Silva Document Versions, get the
descriptions from the metadata, and render them. Here's the code
to do just that:
# Shortcuts for the two service methods used in the loop
gmb = context.service_metadata.getMetadataValue
gmcs = context.service_editorsupport.getMixedContentSupport
items = []
for r in results:
    obj = r.getObject()
    desc = gmb(obj, 'silva-extra', 'content_description')
    # Create an empty, unattached <p> element to look up the content support
    desc_el = obj.content.documentElement.createElement('p')
    supp = gmcs(obj.aq_parent, desc_el)
    # can't have unicode in supp.parse
    desc = desc.encode('utf-8')
    supp.parse(desc)
    items.append((obj.get_title(), (obj, supp.renderHTML())))
# Sort by title, then keep only the (obj, html) pairs
items.sort()
items = [i[1] for i in items]
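The loop above assumes every document version has a description. Assuming the metadata value may come back empty or as None (I have not verified what getMetadataValue returns for an unset field), a small guard could be used before parsing. This is a minimal sketch; prepare_description is a hypothetical helper name.

```python
def prepare_description(desc):
    """Return desc as a UTF-8 byte string, or None if there is nothing to render.

    Hypothetical helper: assumes the metadata value is either None,
    a text (unicode) string, or an already-encoded byte string.
    """
    if desc is None:
        return None
    if isinstance(desc, bytes):
        # Already encoded; pass it through unchanged
        encoded = desc
    else:
        encoded = desc.encode('utf-8')
    if not encoded.strip():
        # Empty or whitespace-only descriptions are skipped
        return None
    return encoded
```

Inside the loop, the result could be checked with "if desc is None: continue" so that documents without a description are simply left out of the list.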
In depth (and in slightly reverse order):
service_editorsupport has functions (non-public API, I believe) that generate the html for silva markup. As a refresher, 'silva markup' is the silva xml that appears between <p> and <hX> tags when viewing the parsed xml of a document version in the ZMI. There are two interesting functions: renderHTML and renderEditable. The former renders the xml for public content, and the latter renders it for the forms-based editor. In this example, we're using renderHTML, as we want this content rendered to the public.
In order to use these functions, we need to retrieve a MixedContentSupport object. A couple of these come stock with SilvaDocument (in mixedcontentsupport.py): ParagraphSupport and PreSupport. service_editorsupport contains a registry mapping (meta_type, element) to content support objects. You use its getMixedContentSupport(context, element) to retrieve the content support for the content's meta_type.
element is an actual ParsedXML element. It doesn't need to have any content, as only the tagName is used to look up the content support object. You can create unattached ParsedXML elements by accessing the current object's content attribute (which is a ParsedXML object) and using the DOM API function createElement to create an empty <p> element.
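To make the registry idea concrete, here is a toy model of a lookup keyed on (meta_type, tagName). It is not Silva's implementation: the class bodies, the SupportRegistry name, its methods, and the 'Silva Document' keys are all stand-ins chosen for illustration.

```python
class ParagraphSupport(object):
    """Stand-in for a content support class (not Silva's real one)."""

    def __init__(self, tag_name):
        self.tag_name = tag_name


class PreSupport(ParagraphSupport):
    """Stand-in for the support class handling <pre> elements."""


class SupportRegistry(object):
    """Toy registry mapping (meta_type, tag name) to a support class."""

    def __init__(self):
        self._registry = {}

    def register(self, meta_type, tag_name, support_class):
        self._registry[(meta_type, tag_name)] = support_class

    def get_support(self, meta_type, tag_name):
        # Only the meta_type and the tag name take part in the lookup;
        # the element's content is never inspected.
        support_class = self._registry[(meta_type, tag_name)]
        return support_class(tag_name)


registry = SupportRegistry()
registry.register('Silva Document', 'p', ParagraphSupport)
registry.register('Silva Document', 'pre', PreSupport)
support = registry.get_support('Silva Document', 'p')
```

This mirrors why an empty <p> element is enough in the example: the element only serves as a key into the registry.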
Once we have the content support object (the supp variable in the example), we parse the description. After the content support object has parsed its content, you can call its rendering functions (e.g. renderHTML).
A use for renderEditable would be a code source that allows you to edit the description (e.g. in kupu). You could then do something interesting with it in the code source for public display (via renderHTML) in the document.
Rendering the updated documents list
To render the updated documents list, we use a page template. The last line of render should look like this:
return context.output_html(docs=items)
The output_html page template might look like this:
<div tal:define="docs options/docs">
  <h3>Recently Updated Documents</h3>
  <ul>
    <li tal:repeat="d docs">
      <b tal:content="python:d[0].get_title()" />:
      <span tal:content="structure python:d[1]" />
    </li>
  </ul>
</div>
Conclusion
Hopefully you've found the material in this article as useful as I have. I haven't actually created a code source yet that uses all of these principles, but I have used all of the concepts described effectively in our CMS. You may have noticed that the altepeter.net homepage contains a code source listing recently updated articles. The tech section of this website uses the cs_document_summary code source linked below to do everything described in this article.
cs_document_summary.zexp - Code source which does everything described above
cs_updated_pages.zexp - Code source to render a list of most recently updated pages (via catalog query)