DocVis API

Core

Docvis’ core elements are used to “build” dynamic HTML documents.

The inputs to these elements are common Python objects describing various user data, and the output is a well-formed HTML document.

Design

  • The key entity here is HTMLTag.

  • Every HTMLTag bears its “name”, content, attributes and external resources.

  • External resources are any additional elements that must be present in a document's <head> section, so that the element renders properly within the page.

  • Here is a simple example based on the <meta> tag:

    This…

    meta_data = HTMLTag("meta", "", attributes={"charset":"utf-8"}, is_empty=True)
    

    … would render as:

    <meta charset="utf-8" />
    

    This of course is already available as docvis.HTMLMeta.

  • And here is a simple example based on the <div> tag:

    This…

    some_script = HTMLTag("div",
                          "<p>Hello World</p>", 
                          ["base_script.js"],
                          {"class":"thediv"})
    

    …would render as

    <div class="thediv">
       <p>Hello World</p>
    </div>
    

    This is self-explanatory, but most importantly, this <div> has external dependencies (base_script.js), which will be propagated up to the HTML page’s head element by Docvis.

    This is not how a div is instantiated and used in Docvis; for that, see docvis.HTMLDiv.

  • All HTML elements produce an HTMLRenderedElement data structure that contains two attributes: extra_resources and code. extra_resources are made available to the elements that need them in order to represent a given element properly, and code is the actual HTML code that makes it into the document.

  • This design was inspired by Bokeh, but it makes sense in transforming any kind of data to an HTML representation without worrying too much about the rest of the elements in a given document. In this way, Bokeh can co-exist with matplotlib and other elements.

  • The absolute top-level “object” here is HTMLPage. This element is responsible for rendering a whole HTML page, complete with its dependencies, to a portable standalone document.
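
The propagation of external resources described in the list above can be sketched with the standard library alone. This is only an illustration of the idea, not the Docvis implementation: RenderedElement, render_leaf and render_nested are hypothetical names, and dropping duplicate resources is an assumption.

```python
from collections import namedtuple

# Mirrors the two-field structure described above; not Docvis's own class.
RenderedElement = namedtuple("RenderedElement", ["extra_resources", "code"])


def render_leaf(tag, content, external_resources=()):
    """Render a simple tag and report the resources it depends on."""
    return RenderedElement(list(external_resources), f"<{tag}>{content}</{tag}>")


def render_nested(tag, rendered_children):
    """Wrap already rendered children and merge their resources.

    Duplicates are dropped while the order of first appearance is kept.
    """
    resources = []
    for child in rendered_children:
        for resource in child.extra_resources:
            if resource not in resources:
                resources.append(resource)
    code = "".join(child.code for child in rendered_children)
    return RenderedElement(resources, f"<{tag}>{code}</{tag}>")


para = render_leaf("p", "Hello World", ["base_script.js"])
plot = render_leaf("div", "plot goes here", ["plot.js", "base_script.js"])
body = render_nested("body", [para, plot])
```

Here body.extra_resources holds each dependency once, so a page-level element could emit them into the head section without knowing which child asked for them.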

author:

Athanasios Anastasiou

date:

Mar 2023

class docvis.core.HTMLBody(children, external_resources=[], attributes={})

HTML Body element

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/body

class docvis.core.HTMLDiv(children, external_resources=[], attributes={})

Content division element

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/div

class docvis.core.HTMLHead(children, external_resources=[], attributes={})

The document header element.

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/head

class docvis.core.HTMLMeta(attributes)

The metadata element.

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meta

class docvis.core.HTMLNestedTag(tag, children, external_resources=[], attributes={})

A nested tag only encloses a set of nested elements

render()

Transform the element’s data into HTML.

class docvis.core.HTMLPage(html_body, html_head)

The top level HTML page.

class docvis.core.HTMLParagraph(content, external_resources=[], attributes={})

A paragraph element

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p

class docvis.core.HTMLPassthrough(content, external_resources=[])

An element for content that is already formatted as HTML.

The content of the element is passed verbatim to its output, along with any additional dependencies.

class docvis.core.HTMLRenderedElement(extra_resources, code)
code

Alias for field number 1

extra_resources

Alias for field number 0

class docvis.core.HTMLScript(content, external_resources=[], attributes={})

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/script

class docvis.core.HTMLStylesheet(content, external_resources=[], attributes={})

A link element specifically configured to point to a stylesheet.

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/link

class docvis.core.HTMLTag(tag, content, external_resources=[], attributes={}, is_empty=False)

Represents a generic HTML tag

Parameters:
  • tag (str) – The tag name WITHOUT the brackets

  • content (str) – The content of the tag

  • external_resources (list[str]) – Any external resources required for this element to render properly (e.g. script files, css, etc). A list of strings, one resource per item.

  • attributes (dict) – Extra attributes for the tag, as a dictionary of attr->attr_value

  • is_empty (bool) – Whether this tag should be rendered as an empty (i.e. self-closing) element

render()

Transform the element’s data into HTML.
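
How the tag, attributes and is_empty parameters could combine into markup is sketched below. The function render_tag is hypothetical and is not part of Docvis; escaping attribute values with html.escape is an assumption about reasonable behaviour, not a documented feature.

```python
from html import escape


def render_tag(tag, content="", attributes=None, is_empty=False):
    """Render one tag; attributes is a dict of attribute name -> value."""
    attributes = attributes or {}
    attr_text = "".join(
        f' {name}="{escape(str(value), quote=True)}"'
        for name, value in attributes.items()
    )
    if is_empty:
        # Empty elements carry no content and no closing tag.
        return f"<{tag}{attr_text} />"
    return f"<{tag}{attr_text}>{content}</{tag}>"


meta = render_tag("meta", attributes={"charset": "utf-8"}, is_empty=True)
div = render_tag("div", "<p>Hello World</p>", {"class": "thediv"})
```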

class docvis.core.HTMLTitle(content)

Document title element

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/title

class docvis.core.RenderableElement

Base class for all elements that return code or fragments of code

Markdown

Markdown rendering represents an additional layer on top of HTML.

There are two key entities in this submodule, HTMLMarkdownDiv and HTMLPreProcMarkdownDiv.

HTMLMarkdownDiv

The first one is self-explanatory: It takes a markdown template, renders it to HTML and embeds that HTML into a div element. This can then be rendered within a document. The Markdown in that element can have dynamic components which will be rendered along. For example:

This…

the_markdown = HTMLMarkdownDiv("# Heading\nHello {{name}}", 
                               {"name":"Flipper"})

…would render as:

<div>
  <h1>Heading</h1>
  <p>Hello Flipper</p>
</div>

HTMLPreProcMarkdownDiv

This element is similar to HTMLMarkdownDiv but it inserts another layer of interpretation of a small DSL that can be used to parametrise an element from within the script in a safe way.

The DSL (fun-dsl) captures Python-like function calls with named keywords only. So, something like myfunction(p1=value1, p2=value2) and so on.

The purpose of fun-dsl is to provide an easy way to parametrise a given element, especially in the case that these elements do not have a string return value.

This was necessary because the underlying template engine (jinja) works under the assumption that a given variable or function call will result in a string. In addition, it is difficult to “trap” the interpretation of a given tag in combination with a design that returns two results that serve different purposes.
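
To make the idea of keyword-only calls with non-string results concrete, here is a minimal sketch that evaluates such a call against a function table using Python's ast module. It is not the fun-dsl parser: evaluate_call and bar_plot are hypothetical names, and restricting argument values to Python literals is an assumption made to keep the example safe and short.

```python
import ast


def evaluate_call(source, function_table):
    """Evaluate 'name(k1=v1, k2=v2)' without calling eval().

    Only keyword arguments with literal values are accepted, and the
    function must be listed in function_table.
    """
    node = ast.parse(source.strip(), mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("expected a plain function call")
    if node.args:
        raise ValueError("positional arguments are not allowed")
    if node.func.id not in function_table:
        raise KeyError(f"unknown function: {node.func.id}")
    kwargs = {}
    for kw in node.keywords:
        if kw.arg is None:
            raise ValueError("**kwargs unpacking is not allowed")
        kwargs[kw.arg] = ast.literal_eval(kw.value)
    return function_table[node.func.id](**kwargs)


def bar_plot(title, values):
    # Returns resources and code together, which a string-only
    # template engine cannot carry through a single substitution.
    return (["plot.js"], f"<div>{title}: {sum(values)}</div>")


table = {"bar_plot": bar_plot}
result = evaluate_call('bar_plot(title="Counts", values=[1, 2, 3])', table)
```

Because the call is resolved before the template engine runs, the two-part result can be split: the code goes into the text and the resources are collected separately.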

Both elements operate with the extra and toc markdown extensions from the Python markdown package.

author:

Athanasios Anastasiou

date:

Jun 2023

class docvis.markdown.HTMLMarkdownDiv(markdown_template, context={}, external_resources=[], attributes={})

An HTML element that renders as a DIV and contains interpreted Markdown.

Parameters:
  • markdown_template (str) – Markdown interspersed with variables and Jinja2 style tags that render variables from the context

  • context (dict) – A mapping of variable names to values

  • external_resources (list) – Any external resources required to render the content of this element.

  • attributes (dict) – A mapping of attribute names to their values

render()

Transform the element’s data into HTML.

class docvis.markdown.HTMLPreProcMarkdownDiv(markdown_template, function_table, mark_start, mark_end, context, external_resources=[], attributes={})

Preprocesses the template and substitutes specific commands with their result over a context.

Parameters:
  • markdown_template (str) – A markdown template string interspersed with other tags

  • function_table (dict) – A mapping from the name of a function call appearing in the document to the callable that implements it.

  • context (dict) – A mapping from variable names to values. This is passed verbatim to jinja’s render

  • external_resources (list) – A list of external resources required for this element (e.g. stylesheets, scripts)

  • attributes (dict) – Attributes for the top level div element. Boolean attributes are rendered without a value.

render()

Transform the element’s data into HTML.

Bokeh

Bokeh visualisation elements

author:

Athanasios Anastasiou

date:

Jul 2023

class docvis.bokeh.HTMLBokehBarPlot(x, y, is_vertical=False, **figure_params)

Visualises categorical count data (e.g. MeSH terms and the number of articles they appear in)

render()

Transform the element’s data into HTML.

class docvis.bokeh.HTMLBokehElement(**figure_params)
render()

Transform the element’s data into HTML.

class docvis.bokeh.HTMLBokehLinePlot(x, y, **figure_params)
render()

Transform the element’s data into HTML.

Preprocessor

The fun-dsl preprocessor.

author:

Athanasios Anastasiou

date:

Jul 2023

class docvis.preprocessor.AstToValue(context=None)

Transforms the tree of tokens at the output of the parser to a computable value.

Parameters:

context (dict) – A dictionary that maps variable names to variables that are accessible for variable substitution

VALUE_IDF(n)

Return the value of an identifier that should exist in the context.

value_idf_accessor(cap)

Return the value of an object at any depth

value_idf_field_accessor(cap)

Return the value of a dictionary at any depth
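
The two accessors resolve a value "at any depth", either through object attributes or through dictionary keys. The sketch below illustrates that idea only; attribute_at and key_at are hypothetical helpers, and passing the path as a list of names is an assumption, since the format of the cap argument is not described here.

```python
from functools import reduce
from types import SimpleNamespace


def attribute_at(context, name, attribute_path):
    """Follow a chain of attributes starting at context[name]."""
    return reduce(getattr, attribute_path, context[name])


def key_at(context, name, key_path):
    """Follow a chain of dictionary keys starting at context[name]."""
    return reduce(lambda current, key: current[key], key_path, context[name])


context = {
    "study": SimpleNamespace(author=SimpleNamespace(name="Flipper")),
    "settings": {"plot": {"width": 400}},
}
author_name = attribute_at(context, "study", ["author", "name"])
plot_width = key_at(context, "settings", ["plot", "width"])
```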

class docvis.preprocessor.EvaluatedSegment(result, original_string, substituted_string)
original_string

Alias for field number 1

result

Alias for field number 0

substituted_string

Alias for field number 2

class docvis.preprocessor.TemplatePreprocessor(function_table, context, mark_start, mark_end)

Preprocesses a string template and substitutes specific string patterns

Parameters:
  • function_table (dict) – A mapping from the name of a function to a callable that implements the call.

  • context (dict) – A mapping from the name of a variable to its value.

  • mark_start (str) –

    The opening paragraph marker (e.g. %\$)

    Note

    The start and end markers are embedded in a regular expression to track and extract the fun-dsl paragraphs. Therefore, if mark_start or mark_end contains characters that are special in regular expressions, such as $, these have to be escaped.

  • mark_end (str) – The closing paragraph marker (e.g. \$%)
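
The note on escaping can be illustrated with a small sketch. The function find_segments is hypothetical and the pattern it builds is an assumption about how markers might be embedded in a regular expression; it is not the pattern used by TemplatePreprocessor.

```python
import re


def find_segments(template, mark_start, mark_end):
    """Return the text between each pair of markers.

    mark_start and mark_end are inserted into the pattern as they are,
    so they must already be valid (escaped) regular expressions.
    """
    pattern = re.compile(f"{mark_start}(.*?){mark_end}", re.DOTALL)
    return [match.group(1).strip() for match in pattern.finditer(template)]


template = "Intro\n\n%$ plot(kind='bar') $%\n\nOutro"

# re.escape produces a marker that is safe to embed in a pattern.
escaped_start = re.escape("%$")
escaped_end = re.escape("$%")
segments = find_segments(template, escaped_start, escaped_end)

# Unescaped, "$" is read as an end-of-string anchor and nothing is found.
unescaped = find_segments(template, "%$", "$%")
```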

Utilities

Helper functions and pre-configured elements making up the default functionality of DocVis.

author:

Athanasios Anastasiou

date:

July 2023

class docvis.utils.DefaultDocVisMarkdownDiv(markdown_template, context, fun_table_modifier={}, external_resources=[], attributes={})

A pre-configured HTMLMarkdownDiv with all the default functionality of DocVis.

Parameters:
  • markdown_template (str) – A markdown template using {{}} to reference variables in the context.

  • context (dict[str, any]) – A dictionary mapping variables to their values.

  • fun_table_modifier (dict[str, callable]) – A function table that can add or override function definitions

  • external_resources (list[str]) – A list of external resources required for the content of this div to be rendered correctly.

  • attributes (dict[str, any]) – A dictionary of attribute to attribute value listing additional attributes to assign to the resulting div.

docvis.utils.bokeh_bar_plot(**kwargs)

A typical bokeh bar plot

docvis.utils.bokeh_line_plot(**kwargs)

A typical bokeh line plot

Docs

Docvis’ document infrastructure is used to render whole “documents” presenting data along with prose that sets it into context.

“Documents” are collections of HTML content that is (usually) generated from Markdown and data.

Organising documents

  • A Document is composed of Pages or other Documents

  • A Page is composed of text (including markup) and data

Both Pages and Documents have a logical name and a physical name and these names reflect the nested nature of Documents and their Pages. For example, suppose the following document structure where elements with a (d) correspond to documents and (p) correspond to pages.

main(d)
|
+--index(p)
|
+--results(d)
   |
   +--descriptive(p)
   |
   +--model_fitting(p)

In this structure, model_fitting is a Page and its logical name is main/results/model_fitting. Its physical name will be main/results/model_fitting.html but it can be overridden when a Page object is instantiated.
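
The relationship between the two names can be sketched as follows. The helpers logical_name and physical_name are hypothetical and only reproduce the naming rule stated above; how Docvis derives the names internally is not shown here.

```python
import posixpath


def logical_name(parent_names, own_name):
    """Join the names of the enclosing documents with the element's name."""
    return posixpath.join(*parent_names, own_name)


def physical_name(parent_names, own_name, file_name=None):
    """Default to '<name>.html' unless an explicit file name is given."""
    return posixpath.join(*parent_names, file_name or f"{own_name}.html")


parents = ["main", "results"]
logical = logical_name(parents, "model_fitting")
physical = physical_name(parents, "model_fitting")
overridden = physical_name(parents, "model_fitting", file_name="fit.html")
```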

Instantiating document objects

A Page is instantiated by its logical name, template content and data context. For example:

Page("index",
     "# Welcome to this page\n\nHello {{user_name}}",
     {"user_name":"some user"})

For more details please see the Page class further below.

A Document is instantiated by its logical name and content, where content is a list of either Document or Page objects. For example:

Document("main",
         [
             Page("index",
                  "# Welcome\n\nHere be {{animal}}",
                  {"animal":"dragons"})
         ]
        )

For more details please see the Document class further below.

Template utilities

Pages are DefaultDocVisMarkdownDiv objects with the additional availability of the template function link_to.

link_to accepts a link description and path to another page and creates the correct link in a page. For example:

d = Document("main",
             [
                 Page("p1", 
                      "# Page 1\n\nThis is page 1\n\n{{link_to('Page 2','main/p2')}}",
                      {}
                     ),
                 Page("p2",
                      "# Page 2\n\nThis is page 2\n\n{{link_to('Page 1','main/p1')}}",
                      {})
             ]
            )
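
One way a link between two pages could be computed from their logical paths is sketched below. This is not the link_to implementation: relative_link is a hypothetical helper, and both the ".html" suffix and the Markdown link output are assumptions.

```python
import posixpath


def relative_link(description, from_page, to_page):
    """Build a Markdown link from one page's logical path to another's."""
    start_dir = posixpath.dirname(from_page)
    href = posixpath.relpath(f"{to_page}.html", start=start_dir or ".")
    return f"[{description}]({href})"


sibling = relative_link("Page 2", "main/p1", "main/p2")
upwards = relative_link("Index", "main/results/model_fitting", "main/index")
```

Using a relative target keeps the rendered document portable, since the links stay valid wherever the output directory is moved.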

class docvis.docs.Document(doc_name, content_elements, dir_name=None)

A document is composed of other Documents or Pages

Parameters:
  • doc_name (str) – A logical name for the document

  • content_elements (List[docvis.docs.Document | docvis.docs.Page]) – A list of docvis.docs.Document or docvis.docs.Page objects that make up the content of the document

  • dir_name (str) – The directory name if it has to be set explicitly.

element_by_path(path)

Return a given element by its path or raise an exception if it cannot be located.

Parameters:

path (str) – Path to a given element, from its root.
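
A path lookup of this kind can be sketched over a plain nested dictionary. The tree layout and the function find_by_path are hypothetical, and raising KeyError is an assumption, since the exception type is not specified above.

```python
def find_by_path(tree, path):
    """Walk a nested dict of documents and pages following 'a/b/c'."""
    root_name, *rest = path.split("/")
    if root_name != tree["name"]:
        raise KeyError(path)
    current = tree
    for name in rest:
        children = {c["name"]: c for c in current.get("children", [])}
        if name not in children:
            raise KeyError(path)
        current = children[name]
    return current


tree = {
    "name": "main",
    "children": [
        {"name": "index"},
        {"name": "results",
         "children": [{"name": "descriptive"}, {"name": "model_fitting"}]},
    ],
}
found = find_by_path(tree, "main/results/model_fitting")
```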

class docvis.docs.Page(page_name, template, data_context, extra_resources=[], file_name=None)

Resolves to an HTML page that renders text and data.

Parameters:
  • page_name (str) – A logical name for the page

  • template (str) – The Markdown content of the page

  • data_context (dict) – A mapping from variable names to values.

  • extra_resources (list[str]) – A list of external resources required for this element (e.g. stylesheets, scripts).

  • file_name (str) – The filename for this page if it has to be set explicitly.

class docvis.docs.RenderableDocElement(logical_name, physical_name=None, parent=None)

Base class for all objects that can produce hard-copies.

Parameters:
  • logical_name (str) – The logical name of the element.

  • physical_name (str) – The physical name of the element (i.e. the name it will appear as in the file-system).

  • parent (Document) – The Document this element is parented to

collect_parents()

Collect all Document parents from a given level in the hierarchy and above.

get_fs_path()

Return the path of a given file on the filesystem

get_root()

Return the root object of a given hierarchy.
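
The parent-based navigation described by collect_parents and get_root can be sketched with a minimal class that only keeps a name and a parent reference. The class Node is hypothetical, and returning the parents nearest first is an assumption about ordering.

```python
class Node:
    """A stand-in for an element that knows its parent, if any."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def collect_parents(self):
        """Return all ancestors, nearest first."""
        parents = []
        current = self.parent
        while current is not None:
            parents.append(current)
            current = current.parent
        return parents

    def get_root(self):
        """Follow parent references until an element without a parent."""
        current = self
        while current.parent is not None:
            current = current.parent
        return current


main = Node("main")
results = Node("results", parent=main)
page = Node("model_fitting", parent=results)
ancestor_names = [node.name for node in page.collect_parents()]
```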