Switching apt documentation away from docbook

Julian Andres Klode
APT's documentation is currently written in docbook. docbook
is a huge XML mess that basically nobody can read or write,
and it would be nice to replace it with something readable.

# Contenders

There are three file formats:

-  Markdown
-  reStructuredText
-  asciidoc

None of them are perfect; all have problems.

# Problems

-  We need to be able to reference common options and other
   stuff we have in apt.ent and apt-vendor.ent. We could run
   a pre-processor on our files to substitute variables (or
   even just keep using XML entities), so this should not be
   a huge issue.

-  We do not want too many dependencies. Our contenders
   use Python or Haskell.

-  The guides: I don't think we actually need all the docbook
   guides we are shipping. The ones in apt-doc certainly could
   be folded into apt as manpages (the parts needed), and the
   ones in libapt-pkg-doc could be internal markdown files.

-  We lose all existing translations
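
The pre-processor from the first point could look roughly like the
following. This is only a sketch: it assumes the entity files contain
simple declarations of the form <!ENTITY name "value">, and the entity
name used in the example is made up, not taken from apt.ent.

```python
import re

# Matches simple internal entity declarations such as
#   <!ENTITY example-product "APT">
# Parameter entities and external entities are deliberately ignored.
ENTITY_DECL = re.compile(r'<!ENTITY\s+([\w.-]+)\s+"([^"]*)"\s*>')

# Matches entity references such as &example-product;
ENTITY_REF = re.compile(r'&([\w.-]+);')


def parse_entities(ent_text):
    """Return a dict mapping entity names to their replacement text."""
    return dict(ENTITY_DECL.findall(ent_text))


def substitute(text, entities):
    """Replace known &name; references and leave unknown ones untouched."""
    def replace(match):
        return entities.get(match.group(1), match.group(0))
    return ENTITY_REF.sub(replace, text)


if __name__ == "__main__":
    # Hypothetical entity, for illustration only.
    entities = parse_entities('<!ENTITY example-product "APT">')
    print(substitute("&example-product; is a package manager.", entities))
```

Leaving unknown references untouched means a typo in an entity name
stays visible in the output instead of silently vanishing.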

# reStructuredText / Sphinx

We could build all documentation, including manual pages,
using Sphinx. This would give us nice HTML guides, and
they can even include manual pages:

https://people.debian.org/~jak/apt-doc/index.html

Sphinx provides built-in support for i18n, however there
are some caveats:

1) manpage titles are not (easily) translatable, as they are
   defined in conf.py (could do gettext inside conf.py, though)

2) assuming we'd have user documentation split like this:

   index.rst  guide/*.rst  man/*.[1-9].rst

   we get the following POT templates (one per top-level entry):
     _build/gettext/index.pot
     _build/gettext/guide.pot
     _build/gettext/man.pot

   and translations follow the same layout:
     locale/$LOCALE/man.po

3) it's huge
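
For caveat 1, doing gettext inside conf.py might look roughly like this.
It is a sketch, not something I have tested against a real Sphinx build:
the "conf" catalog domain, the "locale" directory, the man page entry
and its description string are all placeholders.

```python
import gettext

# Hypothetical names: neither the domain nor the directory is APT's
# real layout.
LOCALE_DIR = "locale"
DOMAIN = "conf"


def build_man_pages(language):
    """Return a Sphinx man_pages list with translated descriptions."""
    # fallback=True yields the untranslated strings when no catalog
    # for the requested language can be found.
    catalog = gettext.translation(
        DOMAIN, localedir=LOCALE_DIR, languages=[language], fallback=True
    )
    _ = catalog.gettext
    # Sphinx expects tuples of
    # (source document, name, description, authors, section).
    return [
        ("man/apt.8", "apt", _("command-line interface"), [], 8),
    ]


# In conf.py these would be plain module-level assignments.
language = "de"
man_pages = build_man_pages(language)
```

The strings in conf.py would still have to be extracted into a catalog
by a separate step, since they are not part of any .rst file.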

An alternative converter for manpages would be rst2man,
which involves far fewer dependencies. This, however, does
not help with translations.

There is also an .. include:: directive in rST, which
might be useful to replicate (some of) the entities.

# Markdown

We already use Markdown for a few files. Pandoc is a good
converter for Markdown to manpages, but it is written in
Haskell, which might be a problematic dependency for us.

I also looked at cmark, but it does not handle definition
lists, so it's unclear how you'd map option lists.

Problems:

-  Markdown has some limited support by po4a, but we
   are definitely missing definition lists there, so
   translating our option lists is hard right now.

-  The syntax is fairly limited

# Asciidoc

A third possible contender might be asciidoc, but it is
a fairly niche format and probably does not need significantly
fewer dependencies than the other options.

--
debian developer - deb.li/jak | jak-linux.org - free software dev
ubuntu core developer                              i speak de, en


Re: Switching apt documentation away from docbook

Julian Andres Klode
On Wed, Aug 21, 2019 at 06:42:09PM +0200, Julian Andres Klode wrote:

> APT's documentation is currently written in docbook. docbook
> is a huge XML mess that basically nobody can read or write,
> and it would be nice to replace it with something readable.
>
> # Contenders
>
> There are three file formats:
>
> -  Markdown
> -  reStructuredText
> -  asciidoc
>
> None of them are perfect, all have problems.
>
> # Problems
>
> -  We need to be able to reference common options and other
>    stuff we have in apt.ent and apt-vendor.ent. I mean we
>    can run a pre-processor on our files to substitute variables,
>    it should not be a huge issue I guess (like we could just
>    use XML entities I guess)
>
> -  We do not want too many dependencies. Our contenders
>    use Python or Haskell.
>
> -  The guides: I don't think we actually need all the docbook
>    guides we are shipping. The ones in apt-doc certainly could
>    be folded into apt as manpages (the parts needed), and the
>    ones in libapt-pkg-doc could be internal markdown files.
>
> -  We lose all existing translations


Another thing to consider is that, ideally, we'd like to
have our files auto-formatted to avoid style conflicts. I guess
pandoc can write the same format it reads, so that might work.
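
A rough sketch of what such a formatter wrapper could look like, assuming
pandoc is installed and both reads and writes the chosen format. The
helper names are made up, and I have not checked how stable pandoc's
output is across versions.

```python
import subprocess


def format_command(fmt):
    """Build a pandoc command that reads and writes the same format."""
    return ["pandoc", "--from", fmt, "--to", fmt]


def reformat(path, fmt="markdown"):
    """Rewrite the file in place; return True if its content changed."""
    with open(path, encoding="utf-8") as handle:
        source = handle.read()
    # pandoc reads from stdin when no input file is given.
    result = subprocess.run(
        format_command(fmt),
        input=source,
        capture_output=True,
        text=True,
        check=True,
    )
    if result.stdout == source:
        return False
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(result.stdout)
    return True
```

Returning whether the file changed would let a CI job fail on
unformatted files instead of rewriting them.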

--
debian developer - deb.li/jak | jak-linux.org - free software dev
ubuntu core developer                              i speak de, en