Ghostscript

At the time of writing, the current Ghostscript version is 9.56.1. The full code described in this article can be found here.

Ghostscript is an excellent interpreter for PostScript and PDF files. It is installed by default on almost all modern Linux distributions.

In this article, I would like to show how to merge multiple PDFs into one file, create a list of bookmarks in it, and add a page number to each page of the generated document.

Combine files

Let’s start by combining the PDFs into one file. It is relatively easy and can be done with the following command:

gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf first_input.pdf second_input.pdf

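Here -dNOPAUSE disables the prompt and pause at the end of each page, -dBATCH makes Ghostscript exit once all input files are processed instead of dropping into the interactive prompt, -sDEVICE selects the so-called output device (all Ghostscript devices can be found here) and -sOutputFile names the file that receives the output. Full reference.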
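
When the list of inputs comes from a script, file names containing spaces are easy to mangle. The following is a minimal sketch of a wrapper around the command above. The function name merge_pdfs and the GS_BIN variable (which lets the gs binary be swapped out, for example for a dry run with echo) are assumptions made for this illustration, not part of Ghostscript.

```shell
# merge_pdfs OUTPUT INPUT...
# Hypothetical helper: merges the INPUT files into OUTPUT with pdfwrite.
# GS_BIN is an assumed override for the Ghostscript binary (defaults to gs).
merge_pdfs() {
  if [ "$#" -lt 2 ]; then
    echo "usage: merge_pdfs OUTPUT INPUT..." >&2
    return 1
  fi
  merge_output=$1
  shift
  # Quoting "$@" keeps every file name a single argument, spaces included.
  "${GS_BIN:-gs}" -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
    -sOutputFile="$merge_output" "$@"
}
```

It could be called as merge_pdfs output.pdf first_input.pdf second_input.pdf, or with GS_BIN=echo in front to print the arguments instead of running Ghostscript.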

PDF bookmarks

Now that we know how to combine the files, let’s continue with generating the PDF bookmarks. To create a bookmark in the combined PDF, we need to know the position (page) of the bookmark in the combined file. This can be done simply by counting the pages of each input PDF.

The following command takes an input file first_input.pdf and outputs the number of pages to stdout.

gs -dQUIET -dNODISPLAY --permit-file-read=first_input.pdf -c "(first_input.pdf) (r) file runpdfbegin pdfpagecount = quit"

Here -dQUIET suppresses the routine informational messages that Ghostscript prints to stdout, -dNODISPLAY suppresses the normal initialization of the output device (nothing is rendered), --permit-file-read=first_input.pdf gives Ghostscript permission to read the input file, and -c allows running PostScript code from the command line instead of from a file. runpdfbegin and pdfpagecount are PostScript procedures provided by Ghostscript (reference).
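
Once the page count of every input is known, the page on which each input starts in the merged file is a running sum. A minimal sketch of that arithmetic follows; the function name start_pages is made up for this example.

```shell
# start_pages COUNT...
# Prints the 1-based page on which each input starts in the merged file,
# given the page count of every input in merge order.
start_pages() {
  next_page=1
  for page_count in "$@"; do
    printf '%s\n' "$next_page"
    next_page=$((next_page + page_count))
  done
}
```

For three inputs with 2, 5 and 3 pages, start_pages 2 5 3 prints 1, 3 and 8 on separate lines.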

Another property we may need from an input file is its title (for example, to use it as the bookmark title). It can be read with the following command:

gs -dBATCH -dQUIET -dPDFINFO -dNODISPLAY first_input.pdf 2>&1 | grep "Title: " | awk -F ': ' '{ print $NF }'
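
Note that the awk stage above prints only the text after the last ': ', so a title such as "Part 1: Introduction" would be cut down to "Introduction". A possible alternative strips just the label. This is a sketch that assumes the -dPDFINFO output carries the title on a line of the form "Title: <text>", as the grep above implies; the function name extract_title is hypothetical.

```shell
# extract_title
# Reads Ghostscript's -dPDFINFO output on stdin and prints the first title.
# Assumption: the title sits on a line of the form "Title: <text>".
extract_title() {
  sed -n 's/^[[:space:]]*Title: //p' | head -n 1
}
```

It would replace the grep and awk stages: gs -dBATCH -dQUIET -dPDFINFO -dNODISPLAY first_input.pdf 2>&1 | extract_title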

Once we know the position of each input file, we can generate the PDF bookmarks. It is also nice to add new metadata to the generated file. Example PostScript code looks as follows:

% Main file metadata
[ /Title (My Title for output pdf)
  /Subject (My Subject for output pdf)
  /Author (John Doe)
  /DOCINFO pdfmark

% Bookmark list
[
  /Page 1 % The first input starts on the first page of the output
  /Title (My Title for first input pdf)
  /OUT pdfmark
[
  /Page 3 % first_input.pdf page count (2 in this example) + 1
  /Title (My Title for second input pdf)
  /OUT pdfmark

We can pass this code either with the -c option (as described earlier) or as a separate file. In this example it is saved as bookmark.ps.

gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf bookmark.ps first_input.pdf second_input.pdf
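
Bookmark titles end up inside PostScript string literals, so a backslash or an unbalanced parenthesis in a title would break the generated code. Below is a minimal sketch of how the entries could be produced from a shell script with the titles escaped. The function names ps_escape and bookmark_entry are assumptions for this illustration, and non-ASCII titles would need additional encoding that is not handled here.

```shell
# ps_escape TEXT
# Escapes backslashes and parentheses so that TEXT is safe inside a
# PostScript string literal ( ... ).
ps_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/[()]/\\&/g'
}

# bookmark_entry PAGE TITLE
# Prints one /OUT pdfmark entry in the same layout as bookmark.ps.
bookmark_entry() {
  printf '[\n  /Page %s\n  /Title (%s)\n  /OUT pdfmark\n' \
    "$1" "$(ps_escape "$2")"
}
```

Calling bookmark_entry 3 "Chapter (draft)" prints an entry whose title reads (Chapter \(draft\)).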

Page numbers

The last (and most complex) step is to add a page number to each page of the output PDF. Preferably, the page number should be placed in the bottom right corner of the page. Example PostScript code looks like this:

% Set PageCount
globaldict /PageCount 1 put
<<
  % A procedure to be executed at the end of each page
  /EndPage {
    % Act only when the reason code is 0 (showpage)
    exch pop 0 eq dup {
      % Prepare PageCount
      /Helvetica 12 selectfont PageCount =string cvs

      % Read string size and put width to the stack
      dup stringwidth pop

      % Read device size
      currentpagedevice /PageSize get 0 get

      % Place in bottom right
      exch sub 60 sub 30

      % Draw
      moveto show

      % Increment PageCount
      globaldict /PageCount PageCount 1 add put
    } if
  } bind
>>
setpagedevice

If we save the above code as pages.ps, the full command looks like this:

gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf bookmark.ps pages.ps first_input.pdf second_input.pdf
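
If the numbering should start at a value other than 1, pages.ps could be generated instead of written by hand. The sketch below writes the same PostScript as above (comments omitted) with a configurable first number; the function name write_pages_ps and its START argument are assumptions for this example.

```shell
# write_pages_ps FILE [START]
# Writes the page-numbering PostScript shown above to FILE, with the
# first printed number set to START (default 1).
write_pages_ps() {
  cat > "$1" <<EOF
globaldict /PageCount ${2:-1} put
<<
  /EndPage {
    exch pop 0 eq dup {
      /Helvetica 12 selectfont PageCount =string cvs
      dup stringwidth pop
      currentpagedevice /PageSize get 0 get
      exch sub 60 sub 30
      moveto show
      globaldict /PageCount PageCount 1 add put
    } if
  } bind
>>
setpagedevice
EOF
}
```

For example, write_pages_ps pages.ps 5 produces a file that numbers the first page as 5, and it can be passed to gs in the same way as the hand-written pages.ps.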

Full PostScript reference can be found here.

Wrapping up

Putting all of the pieces above together, we can create a simple bash script that takes PDF files as input and outputs the merged PDF.

#!/bin/bash

# Usage
# gs/merge.sh *.pdf

# Setup
OUTPUT_FILE=output.pdf
CURRENT_DIR=$(dirname "$0")
PAGE_NUMBER=1
BOOKMARKS="[
  /Title (Output)
  /Subject (Subject)
  /Author (John Doe)
  /DOCINFO pdfmark
"

# Loop through all files
for FILE in "$@"; do
  # Count pages
  PAGES_COUNT=$(gs -dQUIET -dNODISPLAY --permit-file-read="$FILE" -c "($FILE) (r) file runpdfbegin pdfpagecount = quit")
  TITLE=$(gs -dBATCH -dQUIET -dPDFINFO -dNODISPLAY "$FILE" 2>&1 | grep "Title: " | awk -F ': ' '{ print $NF }')

  # Assign file path if Title is empty
  TITLE=${TITLE:-$FILE}

  # Create bookmarks
  BOOKMARKS="$BOOKMARKS [
    /Page $PAGE_NUMBER
    /Title ($TITLE)
    /OUT pdfmark
  "

  # Increment page number
  PAGE_NUMBER=$((PAGE_NUMBER + PAGES_COUNT))
done

# Run main command
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$OUTPUT_FILE" "$CURRENT_DIR/pages.ps" "$@" -c "$BOOKMARKS"
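
The script feeds whatever gs prints straight into shell arithmetic. If gs emits a warning or nothing at all (for instance for a damaged or missing file), the arithmetic would fail or produce wrong page numbers. A defensive check could look like the sketch below; the function name is_page_count is an assumption for this example.

```shell
# is_page_count VALUE
# Succeeds only if VALUE is a positive decimal integer.
is_page_count() {
  case $1 in
    '' | *[!0-9]*) return 1 ;;
  esac
  [ "$1" -gt 0 ]
}

# Possible use inside the loop, right after PAGES_COUNT is set:
#   if ! is_page_count "$PAGES_COUNT"; then
#     echo "Cannot read page count of $FILE" >&2
#     exit 1
#   fi
```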

Arch Linux setup in ansible

This article describes Ansible code, which can be found here.

When searching for a reliable and fast system for everyday use, Arch Linux seems to be the best solution for me. It allows installing only the needed components, and its rolling release model brings updates almost instantly.

However, as always, there is a cost. Since Arch makes no assumptions about what will be installed, almost every package needs to be configured manually. On the one hand, it is extremely time-consuming and takes a lot of effort just to install the operating system. On the other hand, we gain knowledge about the system and its internals, and we can be certain that only the absolute minimum is installed on our local computer.

Another issue is how to preserve that knowledge, for example which configuration was changed or which packages had to be installed. For a plain installation we can follow the Arch Wiki, and for user configuration we can copy our dotfiles. Anything beyond that, such as installing a package or changing a line in a file, is done as an additional manual step.

This approach works in most cases, but I found it awkward to keep the knowledge in several places and to remember which manual steps had to be performed. I first tried writing a bash script, but suspected there was a better solution that would save me from reinventing the wheel.

While researching, I found Ansible, which seemed better suited to the task. It is written in Python, and most of its modules are idempotent: running the same task again changes nothing if the system is already in the desired state. It also reports clearly whether each task succeeded or failed, and it is well suited to server changes and deployments.

After moving all the commands to an Ansible setup, recreating my personal environment is very simple, as shown in the diagram below:

  +----------------------------+       +-----------------------------+
  |  Laptop with ansible code  |  ssh  |  New laptop with empty disk |
  |     (ansible client)       |------>|       (ansible server)      |
  +----------------------------+       +-----------------------------+

Manual changes

Although most of the process is automated, establishing an SSH connection requires a few initial manual changes on both the ansible server and the ansible client.

Ansible server

  1. Boot the Arch Linux live ISO.
  2. Sign in as root and connect to the local network (a Wi-Fi example is shown here):
    • Check the name of the Wi-Fi interface

       ip link
      
    • Create the wpa_supplicant configuration file (assuming the previous command returned wlan0 as the interface name).

        # cat /etc/wpa_supplicant/wpa_supplicant-wlan0.conf
      
        ctrl_interface=/var/run/wpa_supplicant
        ctrl_interface_group=wheel
        update_config=1
        network={
         ssid="my_ssid"
         psk="my_password"
        }
      
    • Restart the wpa_supplicant service

        systemctl restart wpa_supplicant@wlan0
      
  3. Note the local IP address so that the connection can be verified from the ansible client.

     ip addr show wlan0
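
To avoid reading the address by eye, it can be extracted from the one-line output format of ip. The sketch below assumes the usual layout of ip -o -4 addr show, where the fourth field holds the address and prefix length; the function name first_ipv4 is made up for this example.

```shell
# first_ipv4
# Reads `ip -o -4 addr show <interface>` output on stdin and prints the
# first IPv4 address without its prefix length.
first_ipv4() {
  awk '$3 == "inet" { split($4, parts, "/"); print parts[1]; exit }'
}
```

On the live system it would be used as ip -o -4 addr show wlan0 | first_ipv4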
    

Ansible client

  1. Confirm that an SSH connection to the ansible server works (assuming the previous command returned ansible_server_ip).

     ssh root@ansible_server_ip
    
  2. Copy the group variables configuration file (assuming our host is called ansible_server).

     cp group_vars/all group_vars/ansible_server
    
  3. Adjust the configuration as needed.

  4. Set ansible_server_ip in the hosts file.

     # cat hosts
    
     [ansible_server]
     ansible_server_ip
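
The hosts file is small enough to be generated from the address noted earlier. A minimal sketch, with the hypothetical function name inventory, that prints the same two-line layout:

```shell
# inventory GROUP ADDRESS
# Prints a minimal INI-style inventory with one host in one group.
inventory() {
  printf '[%s]\n%s\n' "$1" "$2"
}
```

For example, inventory ansible_server 192.168.1.23 > hosts writes a hosts file for a server at that (example) address.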
    

Run ansible

  1. Run the initial tasks that partition the disk and install a bootable base system.

     ansible-playbook -k site.yml -t disk,boot -l ansible_server
    
  2. Remove the live medium, restart the ansible server and wait for it to boot.

  3. Run any other commands that are needed.

     ansible-playbook -k site.yml -t system,user,ag,git,vim -l ansible_server
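
Both invocations above differ only in their tag list, so they can be wrapped in one helper. This is a sketch under assumptions: the function name run_tags and the PLAYBOOK_BIN variable (an override for the ansible-playbook binary, useful for a dry run with echo) are made up for this illustration.

```shell
# run_tags HOST TAG...
# Runs site.yml against HOST, limited to the given tags.
# PLAYBOOK_BIN is an assumed override for the ansible-playbook binary.
run_tags() {
  if [ "$#" -lt 2 ]; then
    echo "usage: run_tags HOST TAG..." >&2
    return 1
  fi
  run_host=$1
  shift
  # Join the remaining arguments with commas, as the -t option expects.
  run_tag_list=$(IFS=,; printf '%s' "$*")
  "${PLAYBOOK_BIN:-ansible-playbook}" -k site.yml -t "$run_tag_list" -l "$run_host"
}
```

The first stage would then be run_tags ansible_server disk boot, and the second run_tags ansible_server system user ag git vim.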
    

The system should now be usable. We can install any additional packages we need on the server (see the roles directory).

Conclusion

Although configuration can be kept as dotfiles plus personal notes, I feel much more confident knowing that I can recreate my local setup quickly. There is no need to write separate notes (the Ansible code becomes the de facto notes) and, more importantly, no need to solve the same problem multiple times.