Towards a book-printable Jupyter notebook

TL;DR: this page describes how to produce a PDF from a notebook with:

  • Inclusion of input cells
  • Size A5 paper, one-column
  • Markdown images placed ‘here’ with [H] instead of floating
  • Two-sided book printing and less vertical whitespace with the LaTeX book document class
  • All of this summarized in the template a5_book.tplx, which is listed at the bottom of this post

This year I started writing lecture notes for the MOOC Analyzing the Universe in Python notebooks. It would be nice to have something tangible after finishing the course, and the idea of a single button that saves a book-printable PDF was appealing. However, I also remember spending days in the past getting LaTeX content ready for print, and several days is prohibitively long nowadays. Except that now it is the holidays 🙂

Yesterday I time-boxed one day to turn a notebook into a PDF suitable for two-sided printing. Though this endeavor is far from finished, the things I learned might save others some time, and help my future self with further refinement.

I started with the instructions provided by Making publication ready Python Notebooks by Julius Schulz. The main takeaway of his article was to write an additional template that extends a LaTeX template provided by nbconvert, and to fine-tune your own work from there. I needed some more fine-tuning, since I definitely wanted to keep the input cells with astropy calculations. Another big difference, regarding image conversion, was that I do not have images that are the output of cells, such as plots; all images in my notebook are inline markdown images.

TODO for future versions

  1. Number equations with the equation numbering extension that is now part of the jupyter contrib nbextensions project.
  2. Refine the Python markup so cell input is rendered more like it is on screen, or at least show some thin lines to mark the start and end of a code cell.
  3. Find the right paper weight so images do not show through on the other side.
  4. Maybe use a paper size a little bit bigger than A5, perhaps b5paper with 8pt font size.
  5. Combine multiple notebooks into one book, with one chapter per notebook.
  6. Replace the [width=.8\maxwidth] default setting from upstream nbconvert with one set in my custom template, or make it configurable per image.

Running nbconvert from the command line

! Package pdftex.def Error: File `images/week6_lecture1_m31_nebula.png' 
not found.

If your images are specified with relative paths from a notebook that is not in the root directory of the Jupyter notebook server, you need to run the jupyter nbconvert command in the same directory as the notebook; otherwise pdflatex fails with an error like the one above. Running from the command line is also required to specify a custom LaTeX template.

/notebooks/Analyzing the universe$ jupyter nbconvert \
--to=latex --template=a5_book.tplx Week\ 6\ Lecture\ notes.ipynb \
&& pdflatex Week\ 6\ Lecture\ notes.tex
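The same two steps can be scripted so that they always run from the notebook's own directory. The following is only a minimal sketch: the function names build_commands and run_commands are made up for illustration, and it assumes that a5_book.tplx sits next to the notebook and that jupyter and pdflatex are on the PATH.

```python
import shlex
import subprocess
from pathlib import Path


def build_commands(notebook, template="a5_book.tplx"):
    """Return (working_dir, [nbconvert_cmd, pdflatex_cmd]) for one notebook.

    Hypothetical helper: it only assembles the same two commands
    that are typed by hand above.
    """
    nb = Path(notebook)
    nbconvert = ["jupyter", "nbconvert", "--to=latex",
                 f"--template={template}", nb.name]
    pdflatex = ["pdflatex", nb.with_suffix(".tex").name]
    # The working directory is the notebook's directory, so that relative
    # image paths such as images/... resolve for pdflatex.
    return nb.parent, [nbconvert, pdflatex]


def run_commands(notebook, template="a5_book.tplx"):
    """Run nbconvert and pdflatex in the notebook's directory."""
    cwd, commands = build_commands(notebook, template)
    for cmd in commands:
        print(shlex.join(cmd))  # show the command that is about to run
        subprocess.run(cmd, cwd=cwd, check=True)  # stop on the first failure
```

Calling run_commands("Analyzing the universe/Week 6 Lecture notes.ipynb") from the server root would then be roughly equivalent to the shell line above, without having to cd first.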

Paper size A5

I chose paper size A5 for the following reasons:

  1. On A4, the images, which the nbconvert LaTeX base template sets to at most 80% of the page width, are just too big, both vertically and horizontally.
  2. On two-column A4, the Python code is too wide, so it overflows into the other column.
  3. On single-column A5, the width is right for most Python code, and images do not become too large.
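To put rough numbers on the difference, here is a small back-of-the-envelope calculation. It is a sketch under assumptions: the 1in left and right margins for A4 are assumed rather than taken from the base template, the A5 margins are the ones from a5_book.tplx at the bottom of this post (0.5in outer, 1in inner), and the 80% is taken relative to the text width.

```python
MM_PER_INCH = 25.4


def max_image_width_mm(page_width_mm, left_margin_in, right_margin_in,
                       fraction=0.8):
    """Width of an image scaled to `fraction` of the text width, in mm."""
    margins_mm = (left_margin_in + right_margin_in) * MM_PER_INCH
    text_width_mm = page_width_mm - margins_mm
    return fraction * text_width_mm


# A4 is 210 mm wide; 1in margins on both sides are an assumption.
a4 = max_image_width_mm(210, 1.0, 1.0)
# A5 is 148 mm wide; inner=1in and outer=0.5in as in a5_book.tplx.
a5 = max_image_width_mm(148, 1.0, 0.5)

print(f"A4: {a4:.1f} mm, A5: {a5:.1f} mm")  # A4: 127.4 mm, A5: 87.9 mm
```

Under these assumptions an image on A5 ends up about 40 mm narrower than on A4, and its height shrinks proportionally.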

Images that are part of the markdown notes

Make images non-floating

With the following markdown syntax an image can be included:

![Caption text](image/file/location.png)

When the notebook is rendered on the screen, images appear between the text at the place where they are declared. In the default notebook PDF rendering of markdown images, the images are floating; they do not follow the normal stream of text. I did not like this at all, since it differs a lot from the way the notebook is edited and rendered in the browser. The fix requires an understanding of the notebook-to-PDF conversion process. When the notebook is converted to PDF, its markdown is first converted by Pandoc to LaTeX, and the LaTeX is then compiled to PDF. The LaTeX image declaration looks like this:

\begin{figure}

\centering
\includegraphics{images/week6_lecture1_m31_nebula.png}

\caption{Location of M31 in the sky}

\end{figure}

By default LaTeX treats a figure environment as a float: the image is an entity separate from the text, and it can even end up on a different page than the text it was originally placed between.

A couple of solutions are available. One is to edit the markdown and put a \ after each image, which causes Pandoc to skip the \begin{figure} and emit only the LaTeX \includegraphics{} declaration. That is a lot of work. Another solution is to generate or post-process the LaTeX file so all images are declared with \begin{figure}[h].

The solution that I could use, from the TeX Stack Exchange post Latex Figures appear before text in pandoc markdown, was adding the following two lines to the LaTeX template header, which place all floats ‘here’ with the [H] argument:

\usepackage{float}
\floatplacement{figure}{H}

In the LaTeX document the images are still declared without [], but none of them float anymore.
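For comparison, the post-processing alternative mentioned earlier, which I did not end up using, could look roughly like the sketch below. The function name pin_figures is hypothetical, and the regular expression assumes that Pandoc writes a bare \begin{figure} without a placement argument, as in the example above.

```python
import re

# Match \begin{figure} only when no [placement] argument follows it.
FIGURE_BEGIN = re.compile(r"\\begin\{figure\}(?!\[)")


def pin_figures(tex, placement="h"):
    """Append [placement] to every bare \\begin{figure} in a LaTeX string."""
    # A function is used as replacement so the backslash in the matched
    # text is copied literally instead of being read as an escape.
    return FIGURE_BEGIN.sub(lambda m: f"{m.group(0)}[{placement}]", tex)


example = "\\begin{figure}\n\\centering\n\\end{figure}"
print(pin_figures(example))
```

Figures that already carry a placement argument are left alone. Note that a plain [h] is only a request that LaTeX may ignore, which is one reason the float package with [H] is the more robust route.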

Controlling image width (fail)

The nbconvert LaTeX base template sets the default image size to 80% of the page width. It should be possible to control the width of a markdown image by appending {width=..} to its declaration, as shown below.

![Computing the plate scale](images/week6_lecture2_platescale.png){ width=50% }

Pandoc translates the width attribute correctly, but combined with the width specification already given by the base template, the duplicate width specification is invalid LaTeX and gives the following error:

Runaway argument?
width=.8\maxwidth ][width=0.50000\textwidth ]

The failure to make images smaller was one of the reasons for me choosing A5 paper size.

Book LaTeX document class

I switched from article to the book document class for these reasons:

  1. The book document class is meant for two-sided printing; it keeps track of different inner and outer margin sizes.
  2. The book class also takes care of creating a separate title page and starting the actual content on the third page. So if you have the cover page printed on a different kind of paper, only the title page ends up on the special cover paper, and the actual content starts on the first sheet of normal paper.
  3. The book document class produces fewer pages with a lot of vertical whitespace. With the article class some pages had a lot of vertical whitespace between text boxes, which was probably caused by my choice of A5 paper and non-floating images. The book document class uses \flushbottom whereas the article class uses \raggedbottom. For more information, refer to the TeX Stack Exchange post Why does latex stretch small sections across the whole page vertically?

Prevent \geometry from resetting the book margin settings

The nbconvert LaTeX base template contains a \geometry command that redefines all margin settings. To keep the book's inner and outer margins, I used the xparse package to redefine the \geometry command to do nothing, so that later \geometry commands cannot reset the margin settings of the book template.

\usepackage{xparse}
\RenewDocumentCommand{\geometry}{om}{%
}

Top level division from section to chapter

The book document class introduced a side effect: the top-level division of the document is no longer ‘section’ but ‘chapter’, and chapter names and numbers are printed in the header. While Pandoc has the setting --top-level-division, I could not find an option to convince nbconvert to let Pandoc use section as the top-level division. In the Python code the call to Pandoc has a kwargs argument for additional arguments, but nbconvert --help-all does not reveal such an option. The best way to use the book document class is probably to write a book.tplx as an alternative to nbconvert's own article.tplx. However, since I was time-boxed, I put the chapter counter and title in the notebook metadata and let a5_book.tplx emit the content of these variables in the LaTeX document, resulting in code like this:

\setcounter{chapter}{5}

\chapter{Week 6}

Notebook metadata for title page and chapter info

Go to the notebook -> Edit menu -> Edit Notebook Metadata and create a JSON property “latex_metadata” to specify the variables that can be used by the custom LaTeX template. Note that the object format is strictly coupled to the custom template a5_book.tplx below; it is not read by any other software. The latex_metadata object in my notebook looks as follows:

{
  "language_info": {
   ...
  },
  "latex_metadata": {
    "affiliation": "Rutgers the State University of New Jersey",
    "title": "Analyzing the Universe, Week 6 lecture notes",
    "author": "Dr. Terry A. Matilsky",
    "chapter": {
      "setcounter": 5,
      "title": "Week 6"
    }
  }
}
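With many notebooks, editing the metadata by hand becomes tedious. Since an .ipynb file is plain JSON with a top-level "metadata" object, the same property could also be written by a script. The following is a minimal sketch using only the standard library; the helper names with_latex_metadata and update_notebook_file are made up for illustration.

```python
import copy
import json


def with_latex_metadata(notebook, title, author, affiliation,
                        chapter_number, chapter_title):
    """Return a copy of a notebook dict with latex_metadata filled in."""
    nb = copy.deepcopy(notebook)  # leave the caller's dict untouched
    nb.setdefault("metadata", {})["latex_metadata"] = {
        "affiliation": affiliation,
        "title": title,
        "author": author,
        "chapter": {
            # \chapter increments the counter before printing it, so the
            # stored value is one less than the chapter number wanted.
            "setcounter": chapter_number - 1,
            "title": chapter_title,
        },
    }
    return nb


def update_notebook_file(path, **fields):
    """Read an .ipynb file, add latex_metadata, and write it back."""
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    notebook = with_latex_metadata(notebook, **fields)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(notebook, handle, indent=1, ensure_ascii=False)
```

The keys mirror the object shown above, because those are the only names that a5_book.tplx looks for.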

Example pages printed

This was printed and shipped for less than $10. Next time I’ll ask the print shop which paper is suitable for full-colour printing and prevents images from showing through on the other side. Still, I was quite pleased with the result!

a5_book.tplx

The final template I ended up with was the following:

((*- extends 'article.tplx' -*))

((* block docclass *))
\documentclass[9pt, reprint, floatfix, groupaddress, prb, twoside]{book}

% Use a wider inner margin for the two-sided book
\usepackage[a5paper, margin=0.5in, inner=1in]{geometry}

% Ignore future geometry commands with optional and mandatory arguments
\usepackage{xparse}
\RenewDocumentCommand{\geometry}{om}{%
}

% Let all figures float 'H'ere
\usepackage{float}
\floatplacement{figure}{H}

((* endblock docclass *))

% Author and Title from metadata
((* block maketitle *))

((*- if nb.metadata["latex_metadata"]: -*))
((*- if nb.metadata["latex_metadata"]["author"]: -*))
\author{((( nb.metadata["latex_metadata"]["author"] )))}
((*- endif *))
((*- endif *))

((*- if nb.metadata["latex_metadata"]: -*))
((*- if nb.metadata["latex_metadata"]["title"]: -*))
\title{((( nb.metadata["latex_metadata"]["title"] )))}
((*- endif *))
((*- else -*))
\title{((( resources.metadata.name )))}
((*- endif *))

\date{\today}
\maketitle

((*- if nb.metadata["latex_metadata"]: -*))
((*- if nb.metadata["latex_metadata"]["chapter"]: -*))
((*- if nb.metadata["latex_metadata"]["chapter"]["setcounter"]: -*))
\setcounter{chapter}{((( nb.metadata["latex_metadata"]["chapter"]["setcounter"] )))}
((*- endif *))

((*- if nb.metadata["latex_metadata"]["chapter"]["title"]: -*))
\chapter{((( nb.metadata["latex_metadata"]["chapter"]["title"] )))}
((*- endif *))
((*- endif *))
((*- endif *))

((* endblock maketitle *))
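To check what the maketitle block will emit for a given notebook without running nbconvert, the nested conditions can be mirrored in plain Python. This is only an illustrative re-implementation of the template logic above, not something nbconvert runs; the function name title_block_lines is hypothetical.

```python
def title_block_lines(metadata, fallback_title):
    """Mirror the maketitle block of a5_book.tplx as a list of LaTeX lines."""
    latex = metadata.get("latex_metadata") or {}
    lines = []
    if latex.get("author"):
        lines.append("\\author{" + latex["author"] + "}")
    if latex:
        # As in the template: when latex_metadata exists but has no title,
        # no \title line is written at all.
        if latex.get("title"):
            lines.append("\\title{" + latex["title"] + "}")
    else:
        lines.append("\\title{" + fallback_title + "}")
    lines += ["\\date{\\today}", "\\maketitle"]
    chapter = latex.get("chapter") or {}
    if chapter.get("setcounter"):
        lines.append("\\setcounter{chapter}{" + str(chapter["setcounter"]) + "}")
    if chapter.get("title"):
        lines.append("\\chapter{" + chapter["title"] + "}")
    return lines


example = {"latex_metadata": {
    "author": "Dr. Terry A. Matilsky",
    "title": "Analyzing the Universe, Week 6 lecture notes",
    "chapter": {"setcounter": 5, "title": "Week 6"},
}}
print("\n".join(title_block_lines(example, "Week 6 Lecture notes")))
```

Like the template, this skips \setcounter when the stored value is 0, which is harmless because the chapter counter starts at 0 anyway.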