Automated download of Sony Digital Paper notes on Linux

I used paper laboratory notebooks (Vela’s are my favorite) for a long time, but archiving them meant either keeping 50 pounds of notebooks or sawing off the bindings and scanning the pages. I made a couple of attempts to find an acceptable digital note-taking solution, but the options I tried, the Apple iPad Pro with Apple Pencil and the Surface Pro, were unpleasant to write on. Enter the Sony Digital Paper.

The Sony Digital Paper tablets are something I would have killed for back when I was in school. Besides general note-taking, they’re useful for reading technical books and the occasional research paper. The first model, DPT-S1, played well with Linux and was supported by Calibre after v2.33, but cost $1,100 USD at launch. It also supported WebDAV, had a built-in browser, and provided a filesystem accessible via USB. The successor model, DPT-RP1, improved the hardware but crippled the software. The only way to transfer files was now Sony’s Digital Paper app running on a Windows or macOS computer, full stop. I worked around this by running the app in a Windows 10 VM, which let me add books from my library to the device but proved impractical for automatically keeping my notes in sync.

Jan-Gerd Tenberge’s excellent dpt-rp1-py library provided a solution. The rest of this post describes the steps and code I used to set up automatic downloading of my notes. (These files are available in this gist.)

Installation and registration

The dpt-rp1-py library provides a command line utility. We begin by using it to register with the device and obtain a private key:

$ dptrp1 --client-id ~/.dpt-client --key ~/.dpt.key --addr 192.168.1.101 register

This places our client ID and key at ~/.dpt-client and ~/.dpt.key, respectively.

Initially I tried using --addr digitalpaper.local, but the hostname failed to resolve. We can use Avahi for local hostname resolution:

$ avahi-resolve -n digitalpaper.local
Failed to create client object: Daemon not running

We fix this by enabling and starting the daemon:

$ sudo systemctl enable avahi-daemon.service
$ sudo systemctl start avahi-daemon.service

Rerunning the command results in the correct output:

$ avahi-resolve -n digitalpaper.local
digitalpaper.local	192.168.1.101

Next we download and upload a file. We begin by describing the available commands:

$ dptrp1 --help
usage: dptrp1 [-h] --client-id CLIENT_ID --key KEY [--addr ADDR]
              {delete,download,list-documents,new-folder,register,screenshot,upload,wifi,wifi-add,wifi-del,wifi-disable,wifi-enable,wifi-list,wifi-scan}
              [command_args [command_args ...]]

Remote control for Sony DPT-RP1

positional arguments:
  {delete,download,list-documents,new-folder,register,screenshot,upload,wifi,wifi-add,wifi-del,wifi-disable,wifi-enable,wifi-list,wifi-scan}
                        Command to run
  command_args          Arguments for the command

optional arguments:
  -h, --help            show this help message and exit
  --client-id CLIENT_ID
                        File containing the device's client id
  --key KEY             File containing the device's private key
  --addr ADDR           Hostname or IP address of the device

So let’s begin by seeing what documents we can list:

$ dptrp1 --client-id ~/.dpt-client --key ~/.dpt.key --addr 192.168.1.101 list-documents
Document/Book Dump/Software Engineering/Learning Python Design Patterns - Gennadiy Zlobin.pdf
Document/Note/Career.pdf
Document/Note/Project_20180526.pdf
Document/Note/Project_20180515.pdf
Document/Note/Tooling.pdf
Document/Note/Zebra Daily Schedule_20180410.pdf
Document/Note/Meeting Notes_20180425.pdf
Document/Note/Zebra Daily Schedule_20180420.pdf
Document/Note/Zebra Daily Schedule_20180424.pdf
Document/Note/Notepad_20180424.pdf
Document/Note/Personal_Projects.pdf
Document/nyt-italic-worksheet.pdf
Document/Note/Lab_20180329.pdf
Document/Note/JS XML Parser.pdf
Document/Book Dump/Software Engineering/Node.js Design Patterns - Mario Casciaro.pdf
Document/Book Dump/Software Engineering/Node.js Design Patterns - Mario Casciaro_Note.pdf
[...]

Anyhow, let’s download the file we care about, my job search notes:

$ dptrp1 --client-id ~/.dpt-client --key ~/.dpt.key --addr 192.168.1.101 \
 download Document/Note/Career.pdf ~/org/pdf/Career.pdf

Which works. Now, let’s upload a book we want to read, Fluent Python:

$ dptrp1 --client-id ~/.dpt-client --key ~/.dpt.key --addr 192.168.1.101 \
  upload '/home/je/Dropbox/Books/Fluent Python - Luciano Ramalho.pdf' 'Document/Fluent Python.pdf'

Which also succeeds.

Automation

There are two steps: creating the script to handle downloading the files we want, and using systemd to periodically run that script (although cron would also work).

Python Script

The DPT-RP1 responds to list-documents with an array of JSON objects, one per document. A single entry looks like this:

{
  "author": "JME",
  "created_date": "2018-05-27T01:34:01Z",
  "current_page": "1",
  "document_type": "note",
  "entry_id": "af605956-85cd-4f0d-8471-81f694065cb4",
  "entry_name": "Project_20180526.pdf",
  "entry_path": "Document/Note/Project_20180526.pdf",
  "entry_type": "document",
  "file_revision": "39c6e6c926ab.1.0",
  "file_size": "126575",
  "is_new": "false",
  "mime_type": "application/pdf",
  "modified_date": "2018-05-29T15:47:36Z",
  "parent_folder_id": "c62a6d02-0b11-48e0-ab91-8d6a5df8583e",
  "reading_date": "2018-05-29T15:47:35Z",
  "title": "Project",
  "total_page": "1"
}
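
Every value in the entry, including sizes and timestamps, arrives as a string, so it needs converting before it can be compared with local file metadata. Here is a minimal sketch using only the standard library, assuming modified_date always has the YYYY-MM-DDTHH:MM:SSZ shape shown above. The helper name modified_timestamp is mine and is not part of dpt-rp1-py:

```python
from datetime import datetime, timezone


def modified_timestamp(doc):
    """Return doc['modified_date'] as seconds since the Unix epoch (UTC)."""
    # Assumes the 'YYYY-MM-DDTHH:MM:SSZ' format seen in the sample entry.
    parsed = datetime.strptime(doc['modified_date'], '%Y-%m-%dT%H:%M:%SZ')
    return parsed.replace(tzinfo=timezone.utc).timestamp()


# A trimmed copy of the sample entry above.
entry = {
    'entry_path': 'Document/Note/Project_20180526.pdf',
    'file_size': '126575',
    'modified_date': '2018-05-29T15:47:36Z',
}

print(int(entry['file_size']))    # sizes are strings too
print(modified_timestamp(entry))  # comparable with os.path.getmtime()
```

The result is a plain float, so it can be compared directly against the value returned by os.path.getmtime() for the local copy.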

Here I include a script to handle downloading my notes:

#!/usr/bin/env python3
"""
usage: dpt-notes-sync.py ip_address
"""

import os

import dateparser  # third-party; assumed to be the PyPI "dateparser" package

from dptrp1.dptrp1 import DigitalPaper

SYNC_DIR = '/home/je/org/pdf/'

def connect(address):
    """
    Loads the key and client ID to authenticate with the DPT-RP1
    """
    with open('/home/je/.dpt-client', 'r') as f:
        client_id = f.readline().strip()

    with open('/home/je/.dpt.key', 'r') as f:
        key = f.read()

    dpt = DigitalPaper(address)
    dpt.authenticate(client_id, key)
    return dpt

def download_notes(dpt):
    """
    Given an authenticated DigitalPaper instance, download all note files to a
    specified directory.
    """
    for doc in [f for f in dpt.list_documents() if is_modified_note(f)]:
        data = dpt.download(doc['entry_path'])
        local_path = SYNC_DIR + os.path.basename(doc['entry_path'])
        with open(local_path, 'wb') as f:
            f.write(data)
        print('Saved {} to {}'.format(doc['entry_path'], local_path))


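def is_modified_note(doc):
    """
    True if doc is a note that is missing locally or newer than the local copy.
    """
    if doc['document_type'] != 'note':
        return False
    local_path = SYNC_DIR + os.path.basename(doc['entry_path'])
    if not os.path.exists(local_path):
        return True
    # modified_date is a UTC timestamp string; compare it to the local mtime.
    return os.path.getmtime(local_path) < dateparser.parse(
        doc['modified_date']).timestamp()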


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('address', help="IP address of the DPT-RP1")
    args = parser.parse_args()

    try:
        dpt = connect(args.address)
        download_notes(dpt)
    except OSError:
        print('Unable to reach device, verify it is connected to the same network segment.')
        raise SystemExit(1)

Output:

Saved Document/Note/Career.pdf to /home/je/org/pdf/Career.pdf
Saved Document/Note/Notepad_20180607.pdf to /home/je/org/pdf/Notepad_20180607.pdf
Saved Document/Note/Project_20180526.pdf to /home/je/org/pdf/Project_20180526.pdf
[...]
Saved Document/Book Dump/Software Engineering/Node.js Design Patterns - Mario Casciaro_Note.pdf to /home/je/org/pdf/Node.js Design Patterns - Mario Casciaro_Note.pdf
[...]

In is_modified_note() we verify that the document is a note, and that it has been updated since the local version was last modified, if one exists.

We satisfy the first condition by checking the value of document_type rather than the path. Notes created via Create Note are stored in Document/Note, while side-by-side notes are stored alongside the documents they annotate, so no single folder contains them all. Luckily, filtering on document_type also pulls in the side-by-side notes created for any book. This is less brittle than parsing paths, for example with doc['entry_path'].endswith('_Note.pdf').
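
To make the difference concrete, here is a small sketch comparing the two filters on a hand-written listing. The first three paths come from the listing earlier in this post; the fourth entry and the 'normal' type value are assumptions made up for illustration, and both helper names are hypothetical:

```python
def is_note(doc):
    """Select notes by the type the device reports, not by their location."""
    return doc.get('document_type') == 'note'


def looks_like_note_by_path(doc):
    """The more brittle alternative: guess from the path alone."""
    path = doc['entry_path']
    return path.startswith('Document/Note/') or path.endswith('_Note.pdf')


BOOKS = 'Document/Book Dump/Software Engineering/'

# Hypothetical listing; the 'normal' type and the last entry are assumed.
docs = [
    {'entry_path': 'Document/Note/Career.pdf', 'document_type': 'note'},
    {'entry_path': BOOKS + 'Node.js Design Patterns - Mario Casciaro_Note.pdf',
     'document_type': 'note'},
    {'entry_path': BOOKS + 'Node.js Design Patterns - Mario Casciaro.pdf',
     'document_type': 'normal'},
    # An ordinary document whose name happens to end in _Note.pdf.
    {'entry_path': 'Document/Field_Note.pdf', 'document_type': 'normal'},
]

by_type = [d['entry_path'] for d in docs if is_note(d)]
by_path = [d['entry_path'] for d in docs if looks_like_note_by_path(d)]

print(len(by_type), 'selected by document_type')
print(len(by_path), 'selected by path')
```

The path-based filter also selects Document/Field_Note.pdf, which is not a note at all, while the type-based filter returns only the two real notes.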

Systemd

I initially used a bash wrapper to avoid having to deal with resolving digitalpaper.local within the Python script.

dpt-notes-sync.sh:

#!/usr/bin/env bash

DPT_IP=$(avahi-resolve-host-name digitalpaper.local | cut -f 2)
/home/je/bin/scripts/dpt-notes-sync.py "$DPT_IP"
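
The lookup the wrapper performs could also live in the Python script itself. Here is a rough sketch, assuming avahi-resolve-host-name is on the PATH and prints the hostname and address separated by a tab, as in the avahi-resolve output shown earlier. The names parse_avahi_output and resolve_dpt are hypothetical helpers, not part of dpt-rp1-py:

```python
import subprocess


def parse_avahi_output(output):
    """Extract the address from 'hostname<TAB>address' output, or None."""
    for line in output.splitlines():
        fields = line.split('\t')
        if len(fields) == 2 and fields[1].strip():
            return fields[1].strip()
    return None


def resolve_dpt(hostname='digitalpaper.local', timeout=10):
    """Resolve the device via Avahi; return None if the lookup fails."""
    try:
        result = subprocess.run(
            ['avahi-resolve-host-name', hostname],
            stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
            universal_newlines=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # Tool missing or lookup hung: treat as "device not found".
        return None
    return parse_avahi_output(result.stdout)


# Parsing can be checked without a device on the network.
print(parse_avahi_output('digitalpaper.local\t192.168.1.101\n'))
```

Keeping the parsing separate from the subprocess call makes the fragile part, the output format, easy to check on its own.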

I then created a systemd user unit named dptsync.service:

[Unit]
Description=Sony DPT-RP1 note synchronization service
After=network.target network-online.target dbus.socket

[Service]
Type=oneshot
ExecStart=-/home/je/bin/scripts/dpt-notes-sync.sh

After a bit of googling I also found this gist that shows how to do this without the bash wrapper using systemctl set-environment. We then modify dptsync.service as follows:

[Unit]
Description=Sony DPT-RP1 note synchronization service
After=network.target network-online.target dbus.socket

[Service]
Type=oneshot
ExecStartPre=/usr/bin/bash -c "/usr/bin/systemctl --user set-environment DPT_IP=$(avahi-resolve-host-name digitalpaper.local | cut -f 2)"
ExecStart=-/home/je/bin/scripts/dpt-notes-sync.py ${DPT_IP}

The - prefix on the command (the =- syntax above) tells systemd to ignore a failing exit status from this command. We do this because the DPT-RP1 may be off the network or in sleep mode.

And a systemd timer file, dptsync.timer:

[Unit]
Description=Sony DPT-RP1 note synchronization service

[Timer]
OnBootSec=2m
OnUnitActiveSec=2h
Unit=dptsync.service

[Install]
WantedBy=timers.target

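This runs our synchronization script shortly after login and every two hours thereafter. Note that OnBootSec is measured from machine boot; OnStartupSec is the setting that counts from the start of the user service manager, which usually means first login. We then place dptsync.service and dptsync.timer in ~/.config/systemd/user and enable the timer: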

$ systemctl --user enable dptsync.timer
$ systemctl --user start dptsync.timer

We can read the journal to see the sync results:

$ journalctl --user-unit dptsync.service
-- Logs begin at Thu 2018-04-12 20:34:11 PDT, end at Fri 2018-06-08 12:10:13 PDT. --
Jun 08 12:07:49 aponte systemd[1346]: Started Sony DPT-RP1 note synchronization service.
Jun 08 12:10:03 aponte systemd[1346]: Starting Sony DPT-RP1 note synchronization service...
Jun 08 12:10:11 aponte dpt-notes-sync.py[27535]: Saved Document/Note/Career.pdf to /home/je/org/pdf/Career.pdf
Jun 08 12:10:11 aponte dpt-notes-sync.py[27535]: Saved Document/Note/Notepad_20180607.pdf to /home/je/org/pdf/Notepad_20180607.pdf
[...]

And failed syncs are logged as follows:

Jun 08 12:07:44 aponte systemd[1346]: Starting Sony DPT-RP1 note synchronization service...
Jun 08 12:07:49 aponte bash[26956]: Failed to resolve host name 'digitalpaper.local': Timeout reached
Jun 08 12:07:49 aponte dpt-notes-sync.py[26967]: Unable to reach device, verify it is connected to the same network segment.

Next steps

  • Uploads: The inverse is also possible: periodically uploading documents placed in a specific directory to the DPT-RP1.
  • Synchronization: The logical next step would be two-way synchronization based on the value of modified_date. However, that isn’t my primary use case; archiving my project and meeting notes is.
  • Calibre Plugin: After synchronization, writing a plugin adding a DPT-RP1 device driver to Calibre would be the most useful to my fellow DPT owners. Unfortunately, this is a non-trivial effort, or someone else would have written one by now. I’d like to take a look, but I’ve got some other projects in my queue to do first.
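
As a starting point for the upload direction, the device-independent part is deciding which local files are missing from the device. Here is a sketch under stated assumptions: plan_uploads is a hypothetical helper, remote_paths would come from the entry_path values in the document listing, and Document is assumed as the destination folder. The transfer itself is left out because it depends on the library's upload call:

```python
import os
import tempfile


def plan_uploads(local_dir, remote_paths, remote_folder='Document'):
    """Return (local_path, remote_path) pairs for PDFs not yet on the device."""
    existing = set(remote_paths)
    plan = []
    for name in sorted(os.listdir(local_dir)):
        if not name.lower().endswith('.pdf'):
            continue  # only PDFs are considered
        remote_path = '{}/{}'.format(remote_folder, name)
        if remote_path not in existing:
            plan.append((os.path.join(local_dir, name), remote_path))
    return plan


# Demonstration with a throwaway directory standing in for an "outbox".
with tempfile.TemporaryDirectory() as outbox:
    for name in ('Fluent Python.pdf', 'notes.txt', 'nyt-italic-worksheet.pdf'):
        open(os.path.join(outbox, name), 'wb').close()
    on_device = ['Document/nyt-italic-worksheet.pdf']
    for local_path, remote_path in plan_uploads(outbox, on_device):
        print(os.path.basename(local_path), '->', remote_path)
```

Comparing by path only means an edited local file is never re-sent; a fuller version would also compare modification times, which is where this starts to overlap with two-way synchronization.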