================
DocVis internals
================
Users of DocVis do not *have to* know the internal details of its design, but
being aware of them can help in understanding why the API looks the way it does.

This section begins with a very brief introduction to the problem that DocVis deals with,
followed by the design of DocVis.

What was DocVis designed to do?
===============================
DocVis was designed to create standalone HTML documents **with** interactive plots.
These documents allow a degree of data exploration and carry the text needed to
place what the plots show in a specific context.

In its primary use case, DocVis takes results from various analysis methods and
wraps them in the text that is necessary to make sense of the results *in a specific scientific context*.

Here is a fictitious but realistic example:

* Suppose there is a database of categorical data.

  * For example:

    * ``Dataset1 : {A,B,F,B,A,F,F,A}``
    * ``Dataset2 : {1,2,3,3,2,3,1,1}``

  * These data could have been organised into this database from various processes (e.g. financial, engineering and others).

* Suppose further that one of the most common operations across such datasets is to obtain
  a count of each category in the dataset.

  * For example, asking for a count on ``Dataset1`` would return something like:
    ``Dataset1_Count: {A:3, B:2, F:3}``

* A user has the ability to request the "counts" of a specific dataset and plot the results
  in the form of a histogram [1]_.

  * Note here that if we were to write this directly in Python, all we would have created
    would be a diagram.

* Upon obtaining that histogram, they also want to be able to write some prose
  around it and forward it to non-technical people who are interested in the
  results and their impact on their field.

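
The counting step in the example above can be sketched with the Python standard library.
This is only an illustration of the operation and is not part of DocVis; the variable
names are made up for this sketch.

.. code-block:: python

   from collections import Counter

   # Hypothetical in-memory copy of ``Dataset1`` from the example above.
   dataset1 = ["A", "B", "F", "B", "A", "F", "F", "A"]

   # Count how many times each category occurs in the dataset.
   dataset1_count = Counter(dataset1)

The resulting mapping is the kind of result that would then be plotted as a histogram
and surrounded by explanatory text.
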
It is possible to produce and share such a document by:

1. Having 4-5 documents open and copying and pasting between them; or
2. Some kind of Bash script automation; or
3. Some kind of "dashboard application" and so on.

Each one of these solutions has pros and cons, but *none* of them would be as simple as
a plain standalone HTML file that can be shared around in an email or easily be served
and updated dynamically if required.
This is what DocVis was designed to do.

.. note::

   Obviously, there are more details to DocVis' use case, but for the moment,
   this is the most compact way of explaining the use case that gave birth to it.

Why not use a template engine?
==============================
It is difficult to answer this question without first considering the data structures that
represent each entity, whether that is pure HTML, Markdown or plain text.

The key problem one might come across with a template engine is that:

*There is no way to customise the appearance of non-string elements*

Template engines assume, throughout their design, that every action of the template language
leads to a string representation that ends up in the output. This is true
whether the template language performs "complex" operations (such as iterating through a
data structure) or simpler ones such as content filters and formatters.

Dynamic plots **are** represented as strings (e.g. HTML tags), but they also carry a second
element that is essential for their proper rendering.
If such an object were sent through a template engine, the engine would be handed an object
with two properties where it expects a single string.

Therefore, a way was needed to "intercept" the templating process to create the dynamic
plots on the fly.

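
A minimal sketch of this limitation follows, using ``string.Template`` from the Python
standard library as a stand-in for a full template engine. ``PlotFragment`` and the file
name ``plot_library.js`` are hypothetical names made up for this illustration; they are
not part of DocVis.

.. code-block:: python

   from dataclasses import dataclass, field
   from string import Template


   @dataclass
   class PlotFragment:
       """Hypothetical plot object: HTML markup plus the resources it relies on."""

       markup: str
       dependencies: list = field(default_factory=list)

       def __str__(self):
           # A template engine only ever asks for this string form.
           return self.markup


   plot = PlotFragment('<div id="plot-1"></div>', ["plot_library.js"])

   # The template substitutes str(plot); the dependencies never reach the output.
   page = Template("<body>$plot</body>").substitute(plot=plot)

Here ``page`` contains the ``div`` markup only. Nothing in it records that
``plot_library.js`` has to be loaded for the plot to work.
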
So, how does it work?
=====================
DocVis recreates a subset of the HTML DOM, small enough to be able to generate
simple HTML pages.

Each HTML element accepts "content", "attributes" and "external dependencies" and, when
it is rendered, produces "HTML code" and "external dependencies".

.. graphviz::

   digraph foo{
      graph [rankdir=LR, splines="spline", nodesep="0.8", bgcolor="transparent"];
      node [shape="plaintext", width=1.5];
      edge [color="#283044"];
      node_content [label="Content", fontcolor="#D52941"];
      node_attributes [label="Attributes", fontcolor="#D52941"];
      node_external_deps [label="External\nDependencies", fontcolor="#283044"];
      node_html_element [label="HTMLElement", fontcolor="#2191FB"];
      node_html_code [label="HTML Code", fontcolor="#D52941"];
      node_html_deps [label="External\nDependencies", fontcolor="#283044"];
      node_content:e -> node_html_element:w;
      node_attributes:e -> node_html_element:w;
      node_external_deps:e -> node_html_element:w;
      node_html_element:e -> node_html_code:w;
      node_html_element:e -> node_html_deps:w;
   }

Everything is self-explanatory in this diagram, except perhaps "External Dependencies".

"External Dependencies" are any resources that:

1. Do not participate in the rendering of the HTML element; but
2. Are absolutely required for the element to be rendered properly.

Here is an example for producing the following HTML snippet:

.. code-block:: html

   <div class="danger">Danger, Will Robinson!</div>

.. graphviz::

   digraph foo{
      graph [rankdir=LR, splines="spline", nodesep="0.8", bgcolor="transparent"];
      node [shape="plaintext", width=1.5];
      edge [color="#283044"];
      node_content [label="Content\nDanger, Will Robinson!", fontcolor="#D52941"];
      node_attributes [label="Attributes\nclass=\"danger\"", fontcolor="#D52941"];
      node_external_deps [label="External\nDependencies\n[\"stylesheet.css\"]", fontcolor="#283044"];
      node_html_element [label="HTMLElement\ndiv", fontcolor="#2191FB"];
      node_html_code [label="HTML Code\n<div class=\"danger\">\nDanger, Will Robinson!\n</div>", fontcolor="#D52941"];
      node_html_deps [label="External\nDependencies\n[\"stylesheet.css\"]", fontcolor="#283044"];
      node_content:e -> node_html_element:w;
      node_attributes:e -> node_html_element:w;
      node_external_deps:e -> node_html_element:w;
      node_html_element:e -> node_html_code:w;
      node_html_element:e -> node_html_deps:w;
   }

Although this is not a very interesting example, if we define a ``div`` ``HTMLElement``,
we have all the necessary information to generate an HTML page that would render it
properly.
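
The idea can be sketched in a few lines of Python. The ``HTMLElement`` class below is a
hypothetical stand-in written for this section, not the actual DocVis class; its
constructor arguments and the ``render`` method name are assumptions chosen to mirror
the diagram above.

.. code-block:: python

   from html import escape


   class HTMLElement:
       """Hypothetical stand-in for a DocVis element (not the library's class)."""

       def __init__(self, tag, content="", attributes=None, external_dependencies=None):
           self.tag = tag
           self.content = content  # plain text or another HTMLElement
           self.attributes = attributes or {}
           self.external_dependencies = list(external_dependencies or [])

       def render(self):
           """Return (html_code, external_dependencies) for this element."""
           dependencies = list(self.external_dependencies)
           if isinstance(self.content, HTMLElement):
               inner_html, inner_dependencies = self.content.render()
               # Pass on the child's dependencies, skipping duplicates.
               dependencies += [d for d in inner_dependencies if d not in dependencies]
           else:
               inner_html = escape(str(self.content))
           attribute_text = "".join(
               f' {name}="{escape(str(value))}"' for name, value in self.attributes.items()
           )
           return f"<{self.tag}{attribute_text}>{inner_html}</{self.tag}>", dependencies


   div = HTMLElement(
       "div",
       content="Danger, Will Robinson!",
       attributes={"class": "danger"},
       external_dependencies=["stylesheet.css"],
   )
   html_code, dependencies = div.render()

In this sketch, ``html_code`` holds the markup for the ``div`` and ``dependencies`` holds
``["stylesheet.css"]``, which is what a page generator would need in order to link the
matching stylesheet.
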
It gets more interesting when we start nesting HTML elements, as in the following:

.. code-block:: html

   <div class="danger">
       <p class="personalised">Danger, Will Robinson!</p>
   </div>

In this case, "Content" is an ``HTMLElement`` itself, resulting in:

.. graphviz::

   digraph foo{
      graph [rankdir=LR, splines="spline", nodesep="0.8", bgcolor="transparent"];
      node [shape="plaintext", width=1.5];
      edge [color="#283044"];
      node_content [label="Content", fontcolor="#D52941"];
      node_attributes [label="Attributes\nclass=\"danger\"", fontcolor="#D52941"];
      node_external_deps [label="External\nDependencies\n[\"stylesheet.css\"]", fontcolor="#283044"];
      node_html_element [label="HTMLElement\ndiv", fontcolor="#2191FB"];
      node_html_code [label="HTML Code\n<div class=\"danger\">\n<p class=\"personalised\">\nDanger, Will Robinson!\n</p>\n</div>", fontcolor="#D52941"];
      node_html_deps [label="External\nDependencies\n[\"stylesheet.css\",\n\"person_messages.css\"]", fontcolor="#283044"];
      node_content_p [label="Content\nDanger, Will Robinson!", fontcolor="#D52941"];
      node_attributes_p [label="Attributes\nclass=\"personalised\"", fontcolor="#D52941"];
      node_external_deps_p [label="External\nDependencies\n[\"person_messages.css\"]", fontcolor="#283044"];
      node_html_element_p [label=