A venture into the world of document formats and formatting documents
For most of written history, formatting a document was done by the person copying it down by hand on a piece of paper. Then printing emerged, which gave birth to the discipline of typesetting. Nowadays, with many different screen sizes and HTML as the most common type of document, I think it is important for me to figure out how these documents work and how I want to set up my environment to produce the type and quality of documents that I want to produce.
Formats are hard
I'll say it loud and I'll say it clear: document formatting is confusing! There are seemingly a million formats for documents and a dozen tools for each format, with the more popular ones having thousands of tools offering a wild variety of extra functionality and perks, all trying to help us helpless humans get something, anything, done.
The thing is that it is debatable whether we really need all these formats, and then, in the end, what to do with all the variety out there. As it is really not my place to say what any of the other humans on this planet do or don't need (even though I do have fairly strong opinions on that matter), I'll treat the latter part of the question.
For some of you, especially those who have not really investigated the realm of computers further than the browser and maybe the Microsoft Office suite of programs (and maybe, for the Mac users out there, the Mac suite of similar programs), this might come as a bit of a surprise. You have then lived a rather blissful (and maybe somewhat aggravating) life, for the waters are wide and deep where this topic is concerned.
Documents surround us. In the end (at least in the Linux philosophy) everything (in our computer's memory at least) is a file (there are even files that contain the entirety of a computer's memory), and every file needs a format to be understood. In this context the analogy to languages beckons (however, the concept then becomes a bit circular in the abstract, as languages are also formats and you can again format languages; please read a book on information theory if you are interested), so I'll try to use the intuitive understanding to build a more abstract one.
A brief dip into the theory
A written language (I'll be sticking with them for now, but I'm sure the following can be generalized to all languages) has (at least as far as I know) a few basic elements:
-
A character set. This is the set of all allowable symbols used in the language. For English this would comprise the Latin alphabet in lower and upper case, the Arabic numerals and of course all the punctuation symbols like brackets, commas and full stops (and of course all the things that I have inevitably forgotten, but I'm sure you get the point).
-
A grammar. The grammar is the set of rules that have to be followed in order to build correct constructs out of those symbols (you might now be thinking about punctuation rules and such, and you would be correct; there are however even more basic rules that in the end are just as important, like letsimaginethatIforsomereasondecidedthatIdidnotneedanyspacesinbetweenmywords).
- Spelling is also a part of grammar, as it defines in what combinations letters are allowed to appear to make sense (and by extension convey meaning), and conveying meaning is of course the whole point of preparing documents.
From these basic building blocks all formatting is crafted. The complexity of formats can vary from terribly simple to terribly complex (for the latter think human language, and for the former truth statements in basic mathematical logic). The thing in the brackets there may have startled you, as you probably think that you can fluently interpret natural human language but have maybe always struggled with math. That is only due to your really cool brain, which somehow makes all of the processing that is going on in your head right now seem absolutely effortless because you have been able to talk for such a long time. The truth is, however, that structurally there are only very few ways in which to combine the letters of mathematical logic (not the entirety of math, remember, just the tiny bit that is formal logic) but billions of ways to build a correct English sentence.
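To make the idea of "a character set plus a grammar" a bit more concrete, here is a minimal sketch in Python of a checker for a toy version of formal logic. The alphabet and the grammar rules are simplified assumptions of mine and not a standard notation: variables are single lowercase letters, `~` stands for "not", `&` for "and" and `|` for "or".

```python
# Toy "language": an alphabet plus a grammar, checked by a tiny recursive parser.
# Assumed grammar (illustrative only):
#   formula  := term (("&" | "|") term)*
#   term     := "~" term | "(" formula ")" | variable
#   variable := one lowercase letter

ALPHABET = set("abcdefghijklmnopqrstuvwxyz~&|()")


def is_well_formed(text: str) -> bool:
    """Return True if text uses only the alphabet and follows the grammar."""
    symbols = [ch for ch in text if ch != " "]  # spaces carry no meaning here
    if not symbols or any(ch not in ALPHABET for ch in symbols):
        return False

    pos = 0  # index of the next unread symbol

    def parse_term() -> bool:
        nonlocal pos
        if pos >= len(symbols):
            return False
        ch = symbols[pos]
        if ch == "~":  # negation applies to the term that follows
            pos += 1
            return parse_term()
        if ch == "(":  # a bracketed formula counts as a single term
            pos += 1
            if not parse_formula():
                return False
            if pos < len(symbols) and symbols[pos] == ")":
                pos += 1
                return True
            return False
        if ch.isalpha():  # a variable
            pos += 1
            return True
        return False

    def parse_formula() -> bool:
        nonlocal pos
        if not parse_term():
            return False
        while pos < len(symbols) and symbols[pos] in "&|":
            pos += 1
            if not parse_term():
                return False
        return True

    # The whole input has to be consumed, not just a valid prefix.
    return parse_formula() and pos == len(symbols)
```

With these rules `is_well_formed("a & (b | ~c)")` is accepted, while `"a & & b"` (breaks the grammar) and `"A & b"` (uses a symbol outside the alphabet) are rejected. The whole grammar fits in three lines of comments, which is roughly what I mean by "terribly simple".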
Then there is the science, art and craft of preparing documents and everything associated with that. In particular, documents are physical manifestations of something akin to language that is frankly way more complicated than language. (I am ignoring the non-verbal cues that humans use and (normally) understand when talking to another human. These complicate the description of inter-human communication by a lot, as these patterns are so old that we can't even really describe them: they are baked into our brain on such a fundamental level that we simply develop feelings when we see them but (at least without adequate training) normally entirely fail to recognize them.) The reason that documents are normally more expressive than a language is that they have the entire 2D plane to play with. This means colors, pictures, text (which is the closest that we come to an abstract language) and fonts, and the question of how to arrange everything in a way that is pleasing to our sense of style without sacrificing too much legibility.
Restricting your choice
The thing with documents (I will be referring to the 2D kind that can be printed; I am willingly ignoring video, audio and other multimedia formats) is that it is now virtually impossible to describe either the alphabet or the grammar. (I can simplify the alphabet to the different colors that a single dot can be printed in, which still gives me millions of options and makes the grammar terribly, terribly unfathomable.) One possible solution to this is to arbitrarily restrict oneself to a very specific subset of all possibilities and see what we can achieve there.
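As a rough back-of-the-envelope check of how large that "alphabet" gets, here is a small Python sketch. It assumes the common screen convention of 8 bits for each of the red, green and blue channels; printers mix colors differently, so treat the numbers as an illustration only.

```python
# Assumption: each dot has three channels (red, green, blue) with 8 bits each.
BITS_PER_CHANNEL = 8
CHANNELS = 3


def colors_per_dot() -> int:
    """Number of distinct colors a single dot can take."""
    return (2 ** BITS_PER_CHANNEL) ** CHANNELS


def possible_images(width: int, height: int) -> int:
    """Number of distinct images on a grid of width x height dots."""
    return colors_per_dot() ** (width * height)


print(colors_per_dot())                  # 16777216
print(len(str(possible_images(2, 2))))   # 29, i.e. a 29-digit number
```

So even a "document" of just 2 by 2 dots already has more possible states than anyone could ever list, which is why restricting the choice is the only practical way forward.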
So this is what I want to do too. I have, for a long time now, wanted a place that can accommodate my thoughts and parts of the knowledge that I have picked up over the years. It should be a repository for me and others to look things up in, and a place to generally record my thoughts about the world. The question that I have faced and still face is how exactly to do that.
Now that I am getting a bit more concrete, another choice emerges: the medium. Some say the medium is the message, but I'd like to consider the two things related but not necessarily identical (you can start your own website or write me an email telling me how wrong I am if you disagree). To decide on the medium there are plenty of options. As I mostly write, I think something like a website or a printable document would be nice as a medium, and I'll probably stick to the conventions of websites on the internet (so pages are oriented vertically and are mostly navigated either by buttons or by scrolling, and scrolling is linear in y, in the left-handed reference system of the screen). That said, I am starting to come up with some interesting ideas now that I am thinking about what I would like to do and how (you will have to come up with them yourself or wait until I am able to present them to you).
So I would like a webpage and something that I can possibly print. So far so good. Now what formats are there to do that? Let's look at websites first.
The WWW and its perils
As far as I was able to find out, the World Wide Web as I grew up with it was developed by scientists (Tim Berners-Lee to be precise) because they wanted an easy way to reference other research papers without having to go through a lengthy excavation process to find the paper in question. The main feature was essentially the hyperlink, which was nothing more than the address of the paper on the computer network (something that I believe is now called a Uniform Resource Locator, or URL for short).
The WWW
More traditional web pages have adopted the "page" metaphor from books and other printed media but are not necessarily restricted to that and have evolved quite considerably over time. Sometimes you can still find those quaint little self-hosted blogs out there on the internet where someone has simply (like the thing you are reading now) put things on a computer and let other people access those resources over the largest single computer network ever to have existed in human history (a.k.a. the internet). These sites are in my opinion the things that are the most true to what the world wide web (I think the www is distinctly different from the internet) was intended to be by the "original" creators at the time of its inception. For better or for worse, it has outgrown that by now. For me, however, the www is at its most beautiful in its original form, because it contains much of the good intentions and naive dreams of the people that initially used it: a place of coexistence and self-expression. For me this early internet embodies the spirit of shared knowledge, cooperation and communication that has the capability (mind you, only really when done right) to make humanity more or less invincible.
I think this was, however, only really due to who was actually using the internet in its very early days: mostly computer enthusiasts and people fascinated enough by technology and well off enough that they could afford all the time and resources to actually own and operate what were terribly expensive machines, or, on the other hand, institutions that were mostly populated with scientists and engineers who (at least in my experience) are sometimes woefully naive (which is why I like them so much). These were the kind of people that life has been good to, who never really had any problems (I'm talking on the scale of poverty, hunger and famine) and therefore could allow themselves that naivety without even thinking about the consequences (that's another whole article right there). (I know how much misogyny and other exclusionism there was and still is in science and engineering, and as a member of that group of people let me say that, for my part, I am sorry about that. When it comes to exclusion, psychological warfare and bullying we have sometimes learnt from the best; some may have been bullied a bit too much.)
Anyway back to the thing I wanted to say... "Traditional" web pages will probably suit me well, as they emulate printed text to a large degree and come with some extra little features (like full text search and stuff). So now the question of how to make a website...
Its perils
This is where the uhmmmm... bit starts. You are reading this on a website and therefore I have found at least one way to bring this document to you and let you read it. (If you want to know how that happened for the very bits and bytes that you are reading now, I'd invite you to press the <F12> key and have a look at what pops up. Your browser should be selected or, better said, should be the active window (don't get me started on windowing systems and the underlying metaphors and problems).)
So I somehow need to write the things I want to write in this HTML stuff. OK, can't be that hard ..... Oh, oops. Well, producing a website that you can read is fairly simple.
Sadly, however, I am now at the technological level of the mid to late 90s. Remember what websites looked like back then (even NASA's JPL did not (really) know what they were doing). Compare that with today's version and you can see that that was more due to the novelty of the www than anything else.
I would, however, like to have a website that you'd also want to read. So I'd like the text to look nice and be in a coherent font and have matching colors and a useful menu and, well, have it look decent. Sadly, however, this means learning a *lot* more about HTML and the very closely related CSS and (if you want anything resembling fancy effects, which I am trying to avoid as best I can) the dreaded JavaScript.
To the non-tech person reading this I'd like to include here a quick browser anatomy lesson.
Probably the most critical piece of software on the planet (after Linux that is)
The web browser is everywhere. There are but a handful of web browsers that have achieved some sort of name recognition, and of those there are 4 (actually more like 3) that at the time of writing are in widespread use: Microsoft's Edge, Mozilla's Firefox, Apple's Safari and last but definitely not least Google's Chrome. I say actually 3 because Edge and Chrome are at this point in time essentially the same browser with a bit of a different menu style. Behind the user interface is largely the same software, as both are built on the open-source Chromium project (UIs are like clothes you can put on top of software (if it is well designed at least): it may look different but in the end it's the same thing underneath (this analogy also works astoundingly well for cars and hoods)).
The thing is that web browsers have evolved to the point where they now include a little virtual computer for each tab of your browser to run all of the software that you download with most webpages nowadays. This is one of the major components that make up a browser. Roughly, there are these parts:
-
A UI and management component. This is where you can set your settings, where cookies are kept in the cookie vault (are you also thinking of a bank vault filled with cookies? If so, me too), where you can set the colors of the tabs and background, and where your bookmarks are stored. This is also where your various passwords are stored and remembered, and so on. This is the part all the tech YouTubers obsess about, because they either have not taken the time to take apart a web browser or they have realized that no one cares about how the thing really works beyond which buttons you can press and what they do (for some I think the second argument is way more compelling, but sometimes I hear things that make me think that they themselves have never really asked any further than that). Frankly, this part gets confusing fast, because it is trying to interact with humans (generally a really complicated thing to get right, and an even more complicated thing to get right consistently, especially if something like 4 billion people use it). A bit of advice here: UI people are wizards. I believe that writing kernel drivers is less complicated than building a good user interface, so if you ever write software and can avoid writing UIs, do.
-
A rendering engine. This part takes this HTML stuff and turns it into something you can read (or at least look at) on a screen. It has to figure out how the text looks, how large the letters have to be, where to put them, which color everything should have, whether anything on the website has to move or spin or something, and on and on and on. It has to take all kinds of (mostly text-based) files and turn them into the various parts of the things you see on the screen when you open up a web browser. This part is mostly terribly mathematical and probably incorporates the work of thousands of computer science PhDs and others. Even the (simple?) task of rendering a letter on your screen is really, really complicated if done in a way that works on almost every computer you can think of, independent of screen size, resolution and other factors (color on screens is also terribly complicated, but this would lead too far), which it does.
-
The JavaScript engine. This part is the (not so) little virtual computer inside every tab. Its job is to run all the software that more often than not is downloaded with the webpage. (For Chrome and Edge that engine is currently called V8.) It can, if you want, do anything a normal computer can.
-
The Document Object Model (DOM). The DOM is something of an abstract thing, and it lives partially in the rendering engine and partially in the JavaScript engine. The DOM is to the browser essentially what a filesystem is to a computer. All the elements that the rendering engine displays (and also the ones that it is not displaying) live in the DOM. I think the right terminology is that the DOM is an object store. The code in the JavaScript engine can access and alter these objects, and by altering them change what the rendering engine displays. This is how browser games work (the game part is the part that runs on the JavaScript engine).
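The relationship between the DOM, the scripts and the rendering engine can be sketched in a few lines of Python. This is a toy model and not how any real browser is implemented; the `Node` class and the `render` function are hypothetical names made up for the illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical DOM-like object: a tag, some text and child nodes."""
    tag: str
    text: str = ""
    visible: bool = True
    children: list[Node] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> list[str]:
    """Stand-in for the rendering engine: turn the tree into lines of text."""
    if not node.visible:
        return []  # the object still lives in the tree, it is just not shown
    lines = ["  " * depth + f"<{node.tag}> {node.text}".rstrip()]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# The "page" as the browser would hold it in memory.
score = Node("p", "Score: 0")
secret = Node("p", "You found the secret!", visible=False)
page = Node("body", children=[Node("h1", "My game"), score, secret])

print("\n".join(render(page)))

# Stand-in for a script running in the JavaScript engine: it changes
# the objects, and the next render shows the changed page.
score.text = "Score: 10"
secret.visible = True

print("\n".join(render(page)))
```

The second printout shows the new score and the previously hidden paragraph, although nothing told the "renderer" what had changed. It simply draws whatever the object store contains at that moment, which is the essence of how a browser game updates the screen.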
Deciding how to build my site
I started off writing HTML by hand and building small amounts of CSS to make it look somewhat acceptable. However, writing good-looking HTML is quite a challenge, especially after I started wanting to build something with a bit more structure in the overall site (index pages and the like). It would have involved writing templates (pre-written HTML files that I would then fill by hand with content wrapped in the appropriate HTML). Nowadays there are tools that do exactly that, and therefore I tried out Pelican (a site generator written in Python). I played around with Pelican but could not really understand the structure that the tool was trying to impose on the site I wanted to build, and that got somewhat frustrating.
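To show what "filling a template" means in principle, here is a minimal sketch using only Python's standard library. The template text and the `build_page` helper are invented for this example; real site generators use far richer template languages, so this is only the bare idea.

```python
import html
from string import Template

# A hypothetical page template; $title and $body are the placeholders.
PAGE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
  <head><title>$title</title></head>
  <body>
    <h1>$title</h1>
    $body
  </body>
</html>
""")


def build_page(title: str, paragraphs: list[str]) -> str:
    """Wrap each paragraph in <p> tags and fill the template."""
    # html.escape keeps characters like < and & from being read as markup.
    body = "\n    ".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return PAGE_TEMPLATE.substitute(title=html.escape(title), body=body)


print(build_page("Formats are hard", ["Document formatting is confusing!"]))
```

The content lives in plain strings and the layout lives in one template, so changing the look of every page means editing a single file instead of every article.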
The website that you are reading right now is generated from CommonMark Markdown using the Zola static site generator. CommonMark has the benefit that it is fairly easy to write, and the syntax is simple and relatively clean. Zola only requires a small amount of extra metadata to display the individual articles correctly, but otherwise it seems to get out of my way, which is quite satisfying. There is a host of "themes" that impose a common style on the entire site. At some point I would like to build a custom theme, but that requires copious amounts of HTML and CSS knowledge that I currently don't have but may acquire one day.
I primarily write text. I would like that text not only to be understandable and overall not (terribly) wrong, but also to look appealing and inviting. Furthermore, I don't only write text in English but also in mathematical notation, and I would also want to be able to print that on a sheet of paper. Graphs and plots can and hopefully will make up at least part of what I want to communicate (they can also help make things clearer for me), so it would be nice if that worked too. This seems to be rather complicated, as web browsers generally don't understand TeX-style mathematical markup. To remedy this, people have kindly reimplemented the math typesetting of TeX as JavaScript programs (MathJax and KaTeX, for example) that take specific elements of the DOM and translate them into typeset math in a format that the browser can display natively (like HTML with CSS, MathML or SVG). As I have not yet integrated the appropriate code into my static site generator, this feature is not yet enabled. I may temporarily switch to a different theme that already has that built in.
So all in all I now have a toolset that can generate consistent HTML with nice CSS. The thing that does not work currently is TeX-like math, but I hope to add that in the future. I don't know if I will at some point split the site into multiple subsites, each with their own theme, so that I can adapt the style to what I want to say (changing to a LaTeX-like style when writing something sciency and switching to something more code-like when writing about computers; sadly, if I do so I'll lose cross-site search, but that's a compromise I am willing to make). In general I would like to get into how to prepare documents for the web properly, but I've read through enough documentation as it is and that would mean even more docs.
Printed Documents
Printed documents have some clear advantages when compared to the web. I know how large the page is beforehand, which lets me be a lot more specific about layout and other things. The problem is that writing PostScript (the language that at least the good printers can understand) by hand is fairly tedious and would not lead to much. So again I am looking for a toolchain that can do the desired thing and not make me too frustrated.
PostScript and its cousin the PDF
Nowadays there is a de facto standard for representing all kinds of printed documents on a computer: PDF, or Portable Document Format. As far as I know it is an adaptation of PostScript that gets rid of the general-purpose programming features (like looping and branching) of the PostScript language, trading computing time for a more explicit, static description of every page (which, before compression, tends to be larger than the corresponding PostScript program for an identical document). So the best bet is to set the target of my toolchain to PDF. The most satisfying thing would probably be to also write CommonMark for the things that I want to print and then simply run them through some sort of converter that spits out a PDF.
As far as I have found out, producing a satisfying PDF from any input seems quite a challenging task. Thankfully many smart people have worked on various programs to help me with this. The favorite by far for properly typeset layout is the TeX engine, originally by Donald Knuth, which has been modified into different versions over the years. (TeX is the only one of these programs that is actually written by Knuth. All the others inherit parts of the source but alter it in one way or another, either to enhance the capabilities of the "original" TeX engine with some feature that is of interest to a subgroup of its users, or to make maintenance improvements that help developers continue working on it. These flavours of TeX are subsequently not allowed to be called TeX and call themselves something similar, like pdfTeX, XeTeX or LuaTeX.)
Meet TeX
Other than TeX I don't know many tools that can produce satisfactory PDF output from a given input. LibreOffice Writer and its companions do not offer the clear structure that explicit tags or tokens allow for, and that clear declaration of what type a specific piece of text is matters to me.
Another thing that I like about TeX (I'll get to the details of this later) is that I can write its source files in plain text, which in turn means that I can use the tools that are by now well known to me, like neovim. This means I don't have to learn another tool to do a job but can use the time gained to focus on a tool that I can then use for multiple purposes, increasing productivity in all areas that are affected by using a single tool. This is definitely a big plus in my book. (Even though I have to be honest that I have not spent enough time on the tools that I use to be able to use them anywhere close to their full potential.)
TeX is, however, a bit of a monster in its own right. There are multiple layers that belong to the TeX that can be installed on most normal computers. It happens only seldom nowadays that a user is confronted with a bare TeX engine and nothing else. In TeX Live, for example, a whole lot of so-called macro packages are additionally downloaded and installed so that they can be called by the TeX files that describe a specific document. Furthermore, most people don't actually use plain TeX but LaTeX, which is itself a macro package that abstracts a lot of the nitty-gritty details of the TeX engine away. Then, on top of LaTeX, many more macro packages have been written that provide some specific functionality that would otherwise have to be implemented by hand. They define things like how exactly figures and images are included in the final PDF, or provide a way to draw graphics via commands in a TeX language (look for PSTricks).
On top of all that, people have also provided editors and IDEs that make this specific workflow easier for TeXnicians of all ages. These provide a build environment and a set of predefined templates for a quick start writing TeX files. I personally like a single tool for a single job, and as the files that I write are text files I use my text editor of choice, which is of course neovim.
LaTeX and ConTeXt
There is, however, not only the LaTeX way of doing things with the underlying TeX engine (of which there are many different versions) but also ConTeXt, which has a slightly different focus. LaTeX tries to abstract away the details of the typesetting task from the user. This is very helpful if the user is, for example, submitting an article to a journal. In that case, the experts at the publisher of the journal prepare a LaTeX class file and give that class file to the different researchers who want to submit a paper to the journal. These researchers now only need to know the structural macros needed to give their text the necessary structure, and the class file takes care of formatting everything in such a way that the expert eye is satisfied.
This is, however, not my most common use case. I actually like typesetting a lot, so I take some pleasure in defining my own layout. To make things more complicated, most LaTeX macro packages are really confusing when it comes to interoperability because, as far as I know, there is no namespacing in the underlying TeX engines, which can lead to different packages overwriting each other's macros. Also, essentially every package has its own terribly long manual and usage requirements that I'd like to avoid reading over and over again (I already have enough to read when I consider the manual of nvim or any other tool I want to use, and that is without the ever-present extensions that I would like to use with nvim).
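What "no namespacing" means in practice can be modelled with a single global dictionary. The sketch below is a toy model in Python, not TeX itself: the package names and the `\note` macro are hypothetical, and real TeX behaviour is more nuanced (LaTeX's `\newcommand`, for instance, refuses to redefine an existing name, while a plain `\def` overwrites it without complaint).

```python
# Toy model of a macro table without namespaces: one global dictionary
# mapping a macro name to (defining package, meaning).
macros: dict[str, tuple[str, str]] = {}


def load_package(package: str, definitions: dict[str, str]) -> list[str]:
    """Define a package's macros globally and report what got overwritten."""
    clashes = []
    for macro, meaning in definitions.items():
        if macro in macros:
            previous_owner = macros[macro][0]
            clashes.append(f"{macro}: {package} overwrites {previous_owner}")
        macros[macro] = (package, meaning)  # the last definition wins
    return clashes


# Two hypothetical packages that both define \note.
load_package("margins", {r"\note": "put text in the margin"})
clashes = load_package("comments", {r"\note": "print a reviewer comment"})

print(clashes)
print(macros[r"\note"])
```

After both packages are loaded, `\note` means whatever the package loaded last says it means, so the result depends on loading order. That is exactly the kind of interaction that makes me nervous.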
The choice I have come to for now is trying out ConTeXt and seeing how I can make decent layouts, configure it to my liking and set up projects with it. There is a bit of a question of how to record the knowledge gained from reading all the different manuals. I am aiming to use a shell for most parts of my work, and so I am trying to find shell tools to get the things done that I want to do. This has been a fairly successful venture so far. I now need to decide how to bind everything together, but that is a question for another article entirely.
So there it is for now. I will fiddle with the after-Dark template for my website, which is generated via Zola, and use ConTeXt for generating PDFs. All the while I will use a shell and a text editor to do the actual writing, and I may figure out a way to convert CommonMark into ConTeXt-style TeX while running everything in the shell (maybe with makefiles, but I really don't know if I want to do that), probably using the shell to write scripts that automate the most common tasks.
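To get a feeling for what such a conversion involves, here is a deliberately tiny sketch in Python. It is an assumption of how one could start and nowhere near a real converter: it only handles `#` headings and `*emphasis*`, the mapping of heading levels to `\chapter`, `\section` and `\subsection` is my own choice, and it does not escape characters that are special in TeX.

```python
import re

# Assumed mapping from heading depth to ConTeXt sectioning commands.
HEADINGS = {1: r"\chapter", 2: r"\section", 3: r"\subsection"}


def commonmark_to_context(markdown: str) -> str:
    """Convert a very small CommonMark subset to ConTeXt source."""
    out = [r"\starttext"]
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,3}) +(.*)$", line)
        if match:
            command = HEADINGS[len(match.group(1))]
            out.append(f"{command}{{{match.group(2).strip()}}}")
        else:
            # *emphasis* becomes {\em emphasis}; everything else passes through
            out.append(re.sub(r"\*([^*]+)\*", r"{\\em \1}", line))
    out.append(r"\stoptext")
    return "\n".join(out)


print(commonmark_to_context("# Formats are hard\n\nDocuments *surround* us."))
```

Because it reads text and writes text, a script like this slots straight into a shell pipeline or a makefile rule, which is the kind of workflow I am after.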
I hope you were able to follow my reasoning and that I have been able to give you a few ideas that you can think about. If you have questions or comments you can always send me a mail.
Addendum
I have now started to look into the ConTeXt environment. I am fairly delighted to see such nice macro declarations that have a consistent style and feel to them and in general encompass all one might want to do with a TeX typesetting engine. I think the starkest difference is that ConTeXt comes with a ready-made, consistent environment, while LaTeX needs to load a whole bunch of macro packages to be able to do relatively basic typesetting things, making a LaTeX setup a good bit more complicated. The ConTeXt Garden wiki is also a fairly concise and good resource for finding authoritative information on how to get things done. This is in my mind superior to having to read through all the manuals that come with the macro packages, on top of having to figure out what LaTeX actually does.
A further little detail that ConTeXt offers, for which I have not really been able to find a counterpart in the LaTeX core, is the ability to show the margins and other typographical areas by setting a flag in the setup area. This makes it quite easy to see how and where changes to the layout take effect. I know there should be packages in LaTeX that do the same thing, but as there are no namespaces for macros I quickly get the feeling that they might clash, and that gives me a bit of the heebie-jeebies. So that's that.