Installation¶
Install requirements¶
To use some of the included plugins, you might want to install the following dependencies:
- chdkptp to use cameras with the CHDK firmware (installed in /usr/local/lib/chdkptp)
- An up-to date version of ScanTailor-enhanced
- pdfbeads
- djvubind
- PySide (available as python-pyside for Debian and Ubuntu)
- libgphoto2
Installing the core from PyPi¶
This will grab the latest release and install all Python dependencies:
$ sudo pip install spreads
Installing plugin dependencies¶
This will grab all Python dependencies for the selected plugins:
$ sudo pip install spreads[chdkcamera,web,hidtrigger]
Adjust the list of plugins as needed.
Installing a nightly build¶
Like from PyPi, only using the latest development version (might break, use with caution!):
$ sudo pip install http://buildbot.diybookscanner.org/nightly/spreads-latest.tar.gz
Configuration¶
Initial configuration¶
To perform the initial configuration, launch the either the configure subcommand or its graphical counterpart, guiconfigure:
$ spread configure
# or
$ spread guiconfigure
The following instructions are mostly target at users of the CLI configuration interface, but all of the available settings are also equally available from the GUI and should be pretty self-explanatory.
You will be asked to select a device driver and some plugins. Next, configure the order in which your postprocessing plugins should be invoked. Think of it as a pipelining system, where each of the plugin gets fed the output of its predecessor.
Next, if you are using two cameras for scanning, your can the target pages for each of your cameras. This is necessary, as the application has to:
- combine the images from both cameras to a single directory in the right order
- set the correct rotation for the captured images
To do both of these things automatically, the application needs to know if the image is showing an odd or even page. Don’t worry, you only have to perform this step once, the orientation is stored on the camera’s memory card (under A/OWN.TXT). Should you later wish to briefly flip the target pages, you can do so via the –flip-target-pages command-line flag.
Note
If you are using a DIYBookScanner and the book is facing you, the device for odd pages is the camera on the left, the one for even pages on the right.
After that, you can choose to setup the focus for your devices. By default, the focus will be automatically detected on each shot. But this can lead to problems: Since the camera uses the center of the frame to obtain its focus, your images will be out of focus in cases where the center of the page does not have any text on it, e.g. in chapter endings. This step is therefore recommended for most users. Before you continue, make sure that you have loaded a book into the scanner, and that the pages facing the camera are evenly filled with text or illustrations.
Once you’re done, you can find the configuration file in the .config/spreads folder in your home directory.
Configuration file¶
spreads writes its configuration file to ~/.config/spreads/config.yaml. In it, you can change all of the available settings to your liking. The configuration options are the same ones that you can set on the command-line, so just call spreads <command> –help to view the documentation. Command-line flags that begin with –no-... should be entered without the no prefix and have yes or no as their value.
Here is an example that demonstrates the general layout:
# Names of activated plugins, postprocessing plugins will be called
# in the order that they are entered here
plugins: [gui, autorotate, scantailor]
# Name of the device driver
driver: chdkcamera
core:
# Enable verbose output on command-line
verbose: no
# Keys that trigger a capture in command-line interface
capture_keys: [' ', b]
# Path to logfile
logfile: ~/.config/spreads/spreads.log
# Loglevel for logfile
loglevel: info
# Device settings
device:
parallel_capture: yes
flip_target_pages: no
# Plugin settings
tesseract:
language: deu-frak
scantailor:
autopilot: no
SpreadPi Setup¶
Materials needed:
- Raspberry Pi (Model B+ recommended)
- network cable
- Class10 SD Card (lower clases will slow down operations significantly). See this list for SD cards known to work well with the Raspberry Pi.
- free ethernet port in your router/switch
- Download the latest version of the SpreadPi disk image of SpreadPi from the buildbot. It contains a fully configured Linux operating system and a complete installation of Spreads, ready to run.
- Extract the image with 7-Zip and follow the tutorial matching your operating system to copy SpreadPi to the SD-Card that goes into the Raspberry Pi: Windows / OS X / Linux.
Note
For most situations, this is all you need to configure the Pi. For advanced users and occasional problematic setups, it is possible to SSH into the Pi and configure it manually. You have to use the following credentials:
- Username:
- spreads
- Password
- spreads
- Root-Password:
- raspberry
- Now that the Pi has an operating system, we need to configure our devices. SpreadPi currently assumes that the user is running CHDK devices, so check the driver documentation for how to correctly set up the cameras.
- Connect the network cable to the Pi and your router or switch. Connect all devices. Turn on the devices first, and only then turn on the Pi. The Pi takes a few minutes to boot for the first time - be patient. It will reboot once to resize the image to fit the whole SD-Card. Spreads is getting an IP address from your network and will display that IP address on the screens of your cameras for you when it is ready to begin.
- Spreads has an easy-to-use web interface. Open a browser on any device that is on the same network as your scanner. If your smartphone or tablet is on your home WiFi network, you can use it to the scanner. To connect to it, enter the IP address that was displayed on the camera screen. Refer to the web plugin documentation for more information on how to use the interface.
Web Interface¶
Installation¶
To install the required dependencies for the web plugin, run the following command:
$ pip install spreads[web]
Alternatively, make sure you have the following modules installed in their most recent versions:
- Flask
- Flask-Compress
- jpegtran-cffi
- requests
- waitress
- zipstream
To use the JavaScript web interface, make sure you use a recent version of Firefox or Chrome.
Startup and Configuration¶
You can launch the web interface with its subcommand:
$ spread web [OPTIONS]
This will serve the spreads web interface and its RESTish-API for the whole network. There are a number of options available:
-
--port
<int>
¶ Port that the web application is listening on. By default this is 5000
-
--mode
<full/scanner/processor>
¶ Mode to run the web plugin in. scanner only exposes functionality that is needed for scanning, while processor only exposes functionality that is needed for postprocessing and output generation. full exposes all available functionality. Instances of spreads running in scanner mode can transfer their workflows to other instances on the network that run in processor mode and let them take care of the postprocessing and output generation.
-
--postprocessing-server
<address>
¶ Select a default postprocesisng server to user. This is only useful if the web plugin is running in scanner mode and the user is planning to transfer workflows to another spreads instance on the network (see above). This configures a default address for such a server that is always shown.
-
--standalone-device
¶
Enable standalone mode. This option can be used for devices that are dedicated to scanning (e.g. a RaspberryPi that runs spreads and nothing else). At the moment the only additional features it enables is the ability to shutdown and reboot the device from the web interface and REST API.
-
--debug
¶
Run the application in debugging mode. This activates source maps in the client-side code, which will increase the initial loading time significantly.
-
--project-dir
<path>
¶ Location where workflow files are stored. By default this is ~/scans.
Interface¶
You can connect to the interface by opening your browser on an address that looks like this:
http://<host-ip-address>:<web-port>
If you are running spreads in your local machine, using localhost or 127.0.0.1 for the IP address will be enough. If you are running it on a remote machine, you will have to find out its IP address. When you are using CHDK cameras and have them turned on when you launch spreads, their displays will show the IP address of the computer they are connected to. The web-port is by default configured to be 5000, though this can be configured.
The initial screen will list all previously created workflows with a small preview image and some information on their status. On clicking one of the workflows, you will be taken to its details page where you can view all of the images and see more information on it. You can also choose to download a ZIP or TAR file with the workflow, containing all images and a configuration file.
From the navigation bar, you can choose to create a new workflow. The only metadata you absolutely have to enter is the workflow name. Note that when you enter a name, you will be offered a selection of ISBN records that might match your title. If you select one of these, the rest of the fields will be filled out automatically.
You can also change driver and plugin settings for this workflow by selecting either one from the dropdown menu. For a reference on what the various options mean, please consult the documentation of the repsective plugin or driver. When you are done, you can submit the workflow and the application will take you to the capture screen.
On the capture screen, you can see two small review images with which you can verify that the last capture went well. Trigger a new capture by clicking the appropriate button and you will see the images update.
If you spotted an error, you can click the Retake button, which will discard the last capture and trigger a new one. Note that the new capture will be triggered immediately, there is no need to use the capture button. Once you are done, use the finish button.
Graphical Interface¶
Installation¶
To enable the GUI wizard, first make sure that you have an up-to date version of PySide installed on your machine.
Then, just re-run the configure step and add gui to your list of plugins.
Interface¶
On the first screen, you can adjust various settings for your scan. You have to specify a project directory before you can continue. The rest of the settings depends on which plugins you have enabled. Select the plugin to configure from the dropdown menu and make your adjustments.

Initial setup page
After you’ve clicked *next*, the cameras will be prepared for capture by setting their zoom and focus levels. At the top of the screen you can see how many pages you’ve already scanned, as well as your current average scanning speed. The text box at the bottom of the screen will display any warnings or error messages that occur during the capture process. Next, initiate a capture by clicking on the button (or pressing one of the capture keys).

Capture page
Once you have captured your first pages, you will see the last two pages your cameras shot. Here you can verify that everything went as expected. Should you notice a mistake, you can discard the previous shot and retake it by clicking on the retake button.

Capture page with control images
Once you’ve finished scanning your book and clicked on the *next* button, spreads will execute all enabled postprocessing plugins in the sequence that you configured. You can verify the progress in the text box.

Postprocessing page
Last, spreads will assemble the processed scans into your enabled output formats. As in the postprocessing step, follow the progress via the text box.

Output page
Command-Line Interface¶
Startup and Configuration¶
$ spread wizard <project-path>
Start spreads in wizard mode. This will go through all of the steps outlined below and store images and output files in project-path. The command-line flags are the same as for the capture, process and output commands.
$ spread capture [OPTIONS] <project-directory>
This command will start a capturing workflow. Make sure that your devices are turned on. After the application is done setting them up, you will enter a loop, where all devices will trigger simultaneously (if not configured otherwise, see below) when you press one of the capture keys (by default: the b or spacebar key). Press r to discard the last capture and retake it. Press f to finish the capture process.
-
--no-parallel-capture
¶
When using two devices, do not trigger them simultaneously but one after the other.
-
--flip-target-pages
¶
When using two devices, flip the configured target pages, i.e. the camera configured to be odd will temporarily be the even device and vice versa. This can be useful when you are scanning e.g. East-Asian literature.
$ spread postprocess <project-directory>
Start the postprocessing workflow by calling each of the postprocessing plugins defined in the configuration one after the other.The transformed images will be stored in project-directory/done.
$ spread output <project-directory>
Start the output workflow, calling each of the output plugins defined in the configuration. All output files will be stored in project-directory/out.
Frequently Asked Questions¶
CHDK Cameras¶
... When capturing, the commands frequently time out.
This is a known issue when both cameras are connected to the same USB hub. It seems to occur less frequently with powered USB hubs, but the safest way to avoid these hickups is to connect each device to a separate USB hub/port. You might also want to try another USB cable.
... USBError: [Errno 13] Access denied (insufficient permissions)
This means that your user is not allowed to write to the camera devices. To temporarily fix this, run$ sudo chmod -R a+rw /dev/bus/usb/*
. To permanently fix the permissions, create a new udev rule that sets the permissions when the devices are plugged in.
... [Error: :80: attempt to call global 'get_gui_screen_width' (a nil value)]
spreads requires CHDK version 1.3.0 or later; you probably have the stable branch v1.2.0 installed on your camera.
Device Drivers¶
In order for your capture device to work with spreads, you need to tell the application which driver it is supposed to use. This can be either done by running the configure subcommand and selecting one from the provided list or by manually editing the configuration file in .config/spreads/config.yaml in your home directory.
Currently, the following drivers are available:
chdkcamera¶
This driver should work with any Canon camera that runs the custom CHDK firmware in version 1.3 or higher.
For it to work, the chdkptp application must be installed in /usr/local/lib/chdkptp (though that path can be configured, see below). You also need to install the pyusb package, with either of the following two commands:
$ pip install spreads[chdkcamera]
$ pip install pyusb
The following cameras have been tested and confirmed to work:
- A2200
- A810
- A410
If you own another CHDK-supported camera and have problems getting it to run with this driver, please open an issue on GitHub, we would love to make it work.
The following configuration keys/command-line flags are available:
-
--sensitivity
<int>
¶ The ISO sensitivity value as a whole number. Default is 80.
-
--shutter-speed
<fraction>
¶ The desired shutter speed as a fractional value. Default is 1/25. The equivalent key in the configuration file is shutter_speed.
-
--zoom-level
<int>
¶ The desired zoom-level as a whole number. Default is 3. Make sure that this value is supported by your camera, or else you will get an error. The equivalent key in the configuration file is zoom_level.
-
--dpi
<int>
¶ The resolution in dots per inch that the camera captures at the given zoom level. Default is 300. You can determine this value yourself by taking a picture of an object with known dimensions, measuring its size in pixels and calculate the dots per inch from that.
-
--shoot-raw
¶
Shoot RAW images instead of JPEG. Please note that this setting is highly experimental at the moment and RAW files are not supported by the postprocessing and output plugins as of now. The equivalent key in the configuration file is shoot_raw.
-
--focus-distance
<int/auto>
¶ This option allows the user to set a fixed focus distance for the cameras by specifying a whole number. This value can be obtained and automatically set in the configuration fileby running the configure command and following the instructions. By default, this value is set to auto, which means that the camera will automatically re-focus for each capture, which might give problems when there is no text or images in the center of the image. The equivalent key in the configuration file is focus_distance
-
--chdkptp-path
<path>
¶ Specify where the application can locate the chdkptp files. By default this is /usr/local/lib/chdkptp.
gphoto2camera¶
This driver works with many PTP compatible camera. The full list of compatible cameras can be found here: http://www.gphoto.org/doc/remote/
For it to work, the following must be installed:
libgphoto2: http://www.gphoto.org/ This provides the low-level PTP interface.
You can either build from source (http://sourceforge.net/projects/gphoto/files/) or install via your local package manager (apt, brew, etc).
- For example, on Mac OS X with brew installed:
$ brew install gphoto2 libgphoto2
piggyphoto: https://github.com/YesVideo/piggyphoto This is the python interface to libgphoto2. The original source is https://github.com/alexdu/piggyphoto (our pull request to merge is pending).
- The easiest way to install is:
$ pip install -e git://github.com/YesVideo/piggyphoto#egg=piggyphoto
The following cameras have been tested and confirmed to work:
- Canon T2i
- Canon 5D mk2
If you own another libgphoto2-supported camera and have problems getting it to run with this driver, please open an issue on GitHub, we would love to make it work.
The following configuration keys/command-line flags are available:
-
--iso
<string>
¶ The ISO value. Default is ‘Auto’.
-
--shutter-speed
<fraction>
¶ The desired shutter speed as a fractional value. Default is 1/25. The equivalent key in the configuration file is shutter_speed.
-
--aperture
<float>
¶ The desired aperture expressed as an f-stop (without the ‘f/’ prefix). Default is 5.6. The equivalent key in the configuration file is aperture.
-
--shoot-raw
¶
Shoot RAW images instead of JPEG. Please note that this setting is highly experimental at the moment and RAW files are not supported by the postprocessing and output plugins as of now. The equivalent key in the configuration file is shoot_raw.
Plugins¶
spreads comes with a variety of plugins pre-installed. Plugins perform their actions at several designated points in the workflow. They can also add specify options that can be set from one of the interfaces.
subcommand plugins¶
These plugins add additional commands to the spread application. This way, plugins can implement additional workflow steps or provide alternative interfaces for the application.
gui¶
Launches a graphical interface to the workflow. The steps are the same as with the CLI wizard, additionally a small thumbnail of every captured image is shown during the capture process. Requires an installation of the PySide packages. Refer to the GUI tutorial for more information.
web¶
Launches the spread web interface that offers a REST-ish API with which you can control the application from any HTTP client. It also includes a client-side JavaScript application that can be used from any recent browser (Firefox or Chrome recommended). Fore more details, consult the Web interface documentation <web_doc> and the REST API documentation <rest_api>
-
--standalone-device
¶
Enable standalone mode. This option can be used for devices that are dedicated to scanning (e.g. a RaspberryPi that runs spreads and nothing else). At the moment the only additional feature it enables is the ability to shutdown the device from the web interface and REST API.
-
--debug
¶
Run the application debugging mode.
-
--project-dir
<path>
¶ Location where workflow files are stored. By default this is ~/scans.
-
--mode
[scanner
,
processor
,
full
(default)]
¶ Select the mode the web plugin is supposed to run in. scanner: Only offer components neccessary for capture and download/submission to a postprocessing server processor: Start as a postprocessing server that can receive workflows over the network from other ‘scanner’ instances full: Combines the above two modes, allows for capture and postprocessing/output generation on the same machine
-
--port
<port> (default: 5000)
¶ Select port on which the web plugin is supposed to listen on
postprocess plugins¶
An extension to the postprocess command. Performs one or more actions that either modify the captured images or generate a different output.
autorotate¶
Automatically rotates the images according to their device of origin.
scantailor¶
Automatically generate a ScanTailor configuration file for your scanned book
and generate output images from it. After the configuration has been generated,
you can adjust it in the ScanTailor UI, that will be opened automatically,
unless you specified the auto
option. The generation of
the output images will run on all CPU cores in parallel.
-
--autopilot
¶
Run ScanTailor on on autopilot and do not require and user input during postprocessing. This skips the step where you can manually adjust the ScanTailor configuration.
-
--detection
<content/page> [default: content]
¶ By default, ScanTailor will use content boundaries to determine what to include in its output. With this option, you can tell it to use the page boundaries instead.
-
--no-content
¶
Disable content detection step.
-
--rotate
¶
Enable rotation step.
-
--no-deskew
¶
Do not deskew images.
-
--no-split-pages
¶
Do not split pages.
-
--no-auto-margins
¶
Disable automatically detect margins.
tesseract¶
Perform optical character recognition on the scanned pages, using the tesseract application, that has to be installed in order for the plugin to work. For every recognized page, a HTML document in hOCR format will be written to project-directory/done. These files can be used by the output plugins to include the recognized text.
-
--language
LANGUAGE
¶ Tell tesseract which language to use for OCR. You can get a list of all installed languages on your system by running spread capture –help.
output plugins¶
An extension to the out command. Generates one or more output files from the scanned and postprocessed images. Writes its output to project-directory/done.
pdfbeads¶
Generate a PDF file from the scanned and postprocessed images, using the pdfbeads tool. If OCR has been performed before, the PDF will include a hidden text layer with the recognized text.
djvubind¶
Generate a DJVU file from the scanned and postprocessed images, using the djvubind tool.
See also
Contributing¶
Extending spreads¶
Setting up a development environment¶
The easiest way to work on spreads is to install it to an editable
virtual Python environment using the virtualenv
tool and installing
spreads into it using pip
with the
-e
option. This option allows the virtual environment to treat a
spreads repository checked out from git as a live installation.
For example, on a Debian-based system, assuming the git repository
for spreads is checked out to ./spreads
:
virtualenv spreadsenv
cd spreadsenv
source ./bin/activate
# The following dependencies are not pulled in automatically by
# setuptools
pip install cffi
pip install jpegtran-cffi
pip install -e ../spreads
Other prerequisite packages you may require include:
libffi-dev libjpeg8-dev libturbojpeg
Adding support for new devices¶
To support new devices, you have to subclass DevicePlugin
in your module and add it as an entry point for
the spreadsplug.devices
namespace to your package’s setup.py
. In it,
you override and implement the features supported by your device. Take a look
at the plugin for CHDK-based cameras and the relevant part of spreads’
setup.py for a reference implementation.
Devices have to implement a yield_devices<spreads.plugin.DevicePlugin.yield_devices> method that scans the system for supported devices and returns fully instantiated device objects for those.
Declaring available configuration options for plugins¶
Device drivers (as well as all plugins) can implement the configuration_templates<spreads.plugin.SpreadsPlugin.configuration_template> method that returns a dictionary of setting keys and PluginOption<spreads.plugin.PluginOption> objects. These options will be visible across all supported interfaces and also be read from the configuration file and command-line arguments.
Extending spreads built-in commands¶
You can extend all of spread’s built-in commands with your own code. To do,
you just have to inherit from the HookPlugin
class and one of the available mixin classes (at
the moment these are CaptureHooksMixin<spreads.plugin.CaptureHooksMixin>,
TriggerHooksMixin<spreads.plugin.TriggerHooksMixin>,
ProcessHookMixin<spreads.plugin.ProcessHookMixin>,
OutputHookMixin<spreads.plugin.OutputHookMixin>). You then have to implement
each of the required methods for the mixins of your choice.
Furthermore, you have to add an entry point for that class in the
spreadsplug.hooks
namespace in your package’s setup.py
file. For a
list of available hooks and their options, refer to the API documentation. Example implementations can be found on GitHub
See also
module spreads.plugin
, module spreads.util
Adding new commands¶
You can also add entirely new commands to the application. Simply subclass
HookPlugin
and
SubcommandHookMixin<spreads.plugin.SubcommandHookMixin>, implement the
add_command_parser
classmethod and add your new class as an entry point to
the spreadsplug.hooks
namespace. See the web and gui plugins for examples
of plugins that add custom subcommands.
API Reference¶
spreads API Reference¶
spreads package
This is the core package for spreads. Except for the spreads.cli
and
spreads.main
modules (which contain the logic for the spread
command-line application) everything in this package is UI-agnostic and
designed to be used from plugins in the spreadsplug namespace.
It includes the following modules (in no particular order):
spreads.main
- Core logic for application startup and parsing of command-line arguments
spreads.cli
- Implementation of the command-line interface, i.e. the configure, capture, postprocess, output and wizard subcommands.
spreads.config
- Classes for working with configuration, both per-workflow and
application-wide. Most important for plugin developers is the
spreads.config.OptionTemplate
class, which allows for the UI-agnostic declaration of configuration options. spreads.workflow
- This is by far the largest module in the core and contains the
spreads.workflow.Workflow
class that is the central entity in the application. Also included are classes for representing single page entities and TOC-entries, as well as various signals that can be emitted by a workflow entity. spreads.metadata
- Contains the
spreads.metadata.Metadata
entity class that manages the reading and writing of metadata values. spreads.plugin
- The most important module for plugin authors. It contains the various
interfaces (all inheriting from
spreads.plugin.SpreadsPlugin
) that plugins and device drivers can implement, as well as functions (intended for use by the core) to enumerate and initialize plugins and device drivers. spreads.util
- Various helper functions that can be useful for both plugin authors and
the core. Also contains the various
Exception
subclasses used throughout the core and the plugin interface. spreads.tkconfigure
- Implementation of the graphical configuration dialog (accessible via the guiconfigure subcommand), using the Tkinter bindings from Python’s standard library.
Public plugin API (realized through a range of abstract classes) and utility functions for enumerating and loading plugins.
-
exception
spreads.plugin.
ExtensionException
(message=None, extension=None)[source]¶ ” Raised when something went wrong during plugin enumeration/ or instantiation.
-
class
spreads.plugin.
SpreadsPlugin
(config)[source]¶ Plugin base class.
-
on_progressed
= <blinker.base.NamedSignal object at 0x7fc9f07e01d0; u'plugin:progressed'>¶
-
classmethod
configuration_template
()[source]¶ Allows a plugin to define its configuration keys.
The returned dictionary has to be flat (i.e. no nested dicts) and contain a OptionTemplate object for each key.
Example:
{ 'a_setting': OptionTemplate(value='default_value'), 'another_setting': OptionTemplate(value=[1, 2, 3], docstring="A list of things"), # In this case, 'full-fat' would be the default value 'milk': OptionTemplate(value=('full-fat', 'skim'), docstring="Type of milk", selectable=True), }
Returns: dict with unicode -> spreads.config.OptionTemplate
-
-
class
spreads.plugin.
DeviceFeatures
[source]¶ Enum that provides various constants that
DeviceDriver
implementations can expose in theirDeviceDriver.features
tuple to declare support for one or more given features.-
PREVIEW
= <DeviceFeatures.PREVIEW: 1>¶ Device can grab a preview picture
-
IS_CAMERA
= <DeviceFeatures.IS_CAMERA: 2>¶ Device class allows the operation of two devices simultaneously (mainly to be used by cameras, where each device is responsible for capturing a single page.
-
CAN_DISPLAY_TEXT
= <DeviceFeatures.CAN_DISPLAY_TEXT: 3>¶ Device can display arbitrary messages on its screen
-
CAN_ADJUST_FOCUS
= <DeviceFeatures.CAN_ADJUST_FOCUS: 4>¶ Device can read set its own focus distance and read out its autofocus
-
-
class
spreads.plugin.
DeviceDriver
(config, device)[source]¶ Base class for device drivers.
Subclass to implement support for different devices.
-
features
= ()¶ Tuple of
DeviceFeatures
constants that designate the features the device offers.
-
classmethod
configuration_template
()[source]¶ Returns some pre-defined options when the implementing devices has the
DeviceFeatures.IS_CAMERA
feature.
-
__init__
(config, device)[source]¶ Set connection information and other properties.
Parameters: - config (
spreads.confit.ConfigView
) – spreads configuration - device (py:class:usb.core.Device) – USB device to use for the object
- config (
-
set_target_page
(target_page)[source]¶ Set the device target page, if applicable.
Parameters: target_page (unicode, one of odd or even) – The target page
-
prepare_capture
()[source]¶ Prepare device for scanning.
What this means exactly is up to the implementation and the type of device, usually it involves things like switching into record mode and applying all relevant settings.
-
capture
(path)[source]¶ Capture a single image with the device.
Parameters: path ( pathlib.Path
) – Path for the image
-
finish_capture
()[source]¶ Tell device to finish capturing.
What this means exactly is up to the implementation and the type of device, with a camera it could e.g. involve retracting the lense.
-
update_configuration
(updated)[source]¶ Update the device configuration.
The implementing device driver should propagate these updates to the hardware and make sure everything is applied correctly.
Parameters: updated (dict) – Updated configuration values
-
on_progressed
= <blinker.base.NamedSignal object at 0x7fc9f07e01d0; u'plugin:progressed'>¶
-
-
class
spreads.plugin.
HookPlugin
(config)[source]¶ Base class for HookPlugins.
Implement one of the available mixin classes (
SubcommandHooksMixin
,CaptureHooksMixin
, py:class:TriggerHooksMixin,ProcessHooksMixin
,OutputHooksMixin
) to register for the appropriate hooks.-
__init__
(config)¶ Initialize the plugin.
Parameters: config ( confit.ConfigView
) – The global configuration object. If the plugin has a __name__ attribute, only the section with plugin-specific values gets stored in the config attribute
-
configuration_template
()¶ Allows a plugin to define its configuration keys.
The returned dictionary has to be flat (i.e. no nested dicts) and contain a OptionTemplate object for each key.
Example:
{ 'a_setting': OptionTemplate(value='default_value'), 'another_setting': OptionTemplate(value=[1, 2, 3], docstring="A list of things"), # In this case, 'full-fat' would be the default value 'milk': OptionTemplate(value=('full-fat', 'skim'), docstring="Type of milk", selectable=True), }
Returns: dict with unicode -> spreads.config.OptionTemplate
-
on_progressed
= <blinker.base.NamedSignal object at 0x7fc9f07e01d0; u'plugin:progressed'>¶
-
-
class
spreads.plugin.
SubcommandHooksMixin
[source]¶ Mixin for plugins that want to provide custom subcommands.
-
__init__
¶ x.__init__(...) initializes x; see help(type(x)) for signature
-
-
class
spreads.plugin.
CaptureHooksMixin
[source]¶ Mixin for plugins that want to hook into the capture process.
-
prepare_capture
(devices)[source]¶ Perform some action before capturing begins.
Parameters: devices (list of DeviceDriver
) – The devices used for capturing
-
capture
(devices, path)[source]¶ Perform some action after each successful capture.
Parameters: - devices (list of
DeviceDriver
) – The devices used for capturing - path (
pathlib.Path
) – Workflow path
- devices (list of
-
finish_capture
(devices, path)[source]¶ Perform some action after capturing has finished.
Parameters: - devices (list of
DeviceDriver
) – The devices used for capturing - path (
pathlib.Path
) – Workflow path
- devices (list of
-
__init__
¶ x.__init__(...) initializes x; see help(type(x)) for signature
-
-
class
spreads.plugin.
TriggerHooksMixin
[source]¶ Mixin for plugins that want to provice customized ways of triggering a capture.
-
start_trigger_loop
(capture_callback)[source]¶ - Start a thread that runs an event loop and periodically triggers
- a capture by calling the capture_callback.
Parameters: capture_callback (function) – The function that triggers a capture
-
stop_trigger_loop
()[source]¶ Stop the thread started by
start_trigger_loop()
.
-
__init__
¶ x.__init__(...) initializes x; see help(type(x)) for signature
-
-
class
spreads.plugin.
ProcessHooksMixin
[source]¶ Mixin for plugins that want to provide postprocessing functionality.
-
process
(pages, target_path)[source]¶ - Perform one or more actions that either modify the captured images
- or generate a different output.
Parameters: - pages (list of
spreads.workflow.Page
) – Pages to be processed - target_path (
pathlib.Path
) – Target directory for processed files
-
__init__
¶ x.__init__(...) initializes x; see help(type(x)) for signature
-
-
class
spreads.plugin.
OutputHooksMixin
[source]¶ Mixin for plugins that want to create output files.
-
output
(pages, target_path, metadata, table_of_contents)[source]¶ Assemble an output file from the pages.
Parameters: - pages (list of
spreads.workflow.Page
) – Project path - target_path (
pathlib.Path
) – Target directory for processed files - metadata (
spreads.metadata.Metadata
) – Metadata for workflow - table_of_contents (list of
spreads.workflow.TocEntry
) – Table of Contents for workflow
- pages (list of
-
__init__
¶ x.__init__(...) initializes x; see help(type(x)) for signature
-
-
spreads.plugin.
available_plugins
()[source]¶ Get the names of all installed plugins.
Returns: List of plugin names
-
spreads.plugin.
get_plugins
(*names)[source]¶ Get instantiated and configured plugin instances.
Parameters: names (unicode) – One or more plugin names Returns: Mapping of plugin name to plugin instance Return type: dict of unicode -> SpreadsPlugin
-
spreads.plugin.
available_drivers
()[source]¶ Get the names of all installed device drivers.
Returns: List of driver names
-
spreads.plugin.
get_driver
(driver_name)[source]¶ Get a device driver.
Parameters: driver_name (unicode) – Name of driver to instantiate Returns: The driver class Return type: DeviceDriver
class
-
spreads.plugin.
get_devices
(config, force_reload=False)[source]¶ Get initialized and configured device instances.
Parameters: - config (
spreads.config.Configuration
) – Global configuration - force_reload (bool) – Don’t load devices from cache
Returns: Device instances
Return type: list of
DeviceDriver
objects- config (
Central Workflow
entity (and its signals) and various associated
entities.
-
exception
spreads.workflow.
ValidationError
(message=None, **kwargs)[source]¶ Raised when some kind of validation error occured.
Attr message: General error message Attr errors: Mapping from field name to validation error message
-
class
spreads.workflow.
Page
(raw_image, sequence_num=None, capture_num=None, page_label=None, processed_images=None)[source]¶ Entity that holds information about a single page.
Attr raw_image: The path to the raw image. Attr processed_images: A dictionary of plugin names mapped to the path of a processed file. Attr capture_num: The capture number of the page, i.e. at what position in the workflow it was recorded, including aborted and retaken shots. Attr sequence_num: The sequence number of the page, i.e. at what position in the list of ‘good’ captures it is. Usually identical with the position in the containing pages list. Defaults to the capture number. Attr page_label: A label for the page. Must be an integer, a string of digits or a roman numeral (e.g. 12, ‘12’, ‘XII’). Defaults to the sequence number. -
__init__
(raw_image, sequence_num=None, capture_num=None, page_label=None, processed_images=None)[source]¶
-
get_latest_processed
(image_only=True)[source]¶ Get the least recent postprocessed file
Parameters: image_only (bool) – Only return image files (e.g. no OCR files) Returns: Path to least recent postprocessed file Return type: pathlib.Path
-
to_dict
()[source]¶ Serialize entity to a dict.
Used by
spreads.util.CustomJSONEncoder
.
-
-
class
spreads.workflow.
TocEntry
(title, start_page, end_page, children=None)[source]¶ Represent a ‘table of contents’ entry.
Attr title: Label/title of the entry Attr start_page: First page of the entry Attr end_page: First page no longer part of the entry - :attr children; Other
TocEntry
objects that designate a - sub-range of this entry
-
to_dict
()[source]¶ Serialize entity to a dict.
Used by
spreads.util.CustomJSONEncoder
.
- :attr children; Other
-
class
spreads.workflow.
Workflow
(path, config=None, metadata=None)[source]¶ Core entity for managing scanning workflows.
Attr id: UUID for the workflow Attr status: Current status. Keys are step
(‘capture’, ‘process’ or ‘output’),step_progress
(Progress as a value between 0 and 1) andprepared
(whether capture is already prepared).Attr path: Path to directory containing the workflow’s data. :type path;
pathlib.Path
:attr bag: Underlying BagIt data structure :type bag: py:class:spreads.vendor.bagit.Bag :attr slug: ASCIIfied version of workflow title without spaces. :attr config: Configuration for the worklfow, takes precedenceover the global configuration).Attr metadata: Metadata, contains at least a title
field.Attr pages: Pages available in the workflow Attr table_of_contents: Table of contents entries in the workflow Attr last_modified: Time of last modification Attr devices: Active devices Attr out_files: Generated output files -
classmethod
create
(location, metadata=None, config=None)[source]¶ Create a new Workflow.
Parameters: - location (unicode or
pathlib.Path
) – Base directory that the workflow should be created in - metadata (dict) – Initial metadata for workflow. Must at least contain a title item.
- config (dict or
spreads.config.Configuration
) – Initial configuration for workflow
Returns: The new instance
Return type: - location (unicode or
-
classmethod
find_all
(location, key=u'slug', reload=False)[source]¶ List all workflows in the given location.
Parameters: - location (unicode or
pathlib.Path
) – Location where the workflows are located - key (str/unicode) – Attribute to use as key for returned dict
- reload (bool) – Do not load workflows from cache
Returns: All found workflows
Return type: - location (unicode or
-
classmethod
find_by_id
(location, id)[source]¶ Try to locate a workflow with the given id in a directory.
Parameters: - location (unicode or
pathlib.Path
) – Base directory that contains workflows to be searched among - id – ID of workflow to be searched for
Return type: Workflow
or None- location (unicode or
-
classmethod
find_by_slug
(location, slug)[source]¶ Try to locate a workflow that matches a given slug in a directory.
Parameters: - location (unicode or
pathlib.Path
) – Base directory that contains workflows to be searched among - slug (unicode) – Slug of workflow to be searched for
Return type: Workflow
or None- location (unicode or
-
classmethod
remove
(workflow)[source]¶ Delete a workflow from the disk and cache.
Parameters: workflow ( Workflow
) – Workflow to be deleted
-
remove_pages
(*pages)[source]¶ Remove one or more pages from the workflow.
This will irrevocably remove the page metadata as well as all of its associated files, so use responsibly!
Parameters: pages ( Page
) – One or more pages to remove
-
crop_page
(page, left, top, width=None, height=None, async=False)[source]¶ Crop a page’s raw image.
Parameters: - page – Page the raw image of which should be cropped
- left – X coordinate of crop boundary
- top – Y coordinate of crop boundary
- width – Width of crop box
- height – Height of crop box
- async – Perform the cropping in a background thread
Returns: The Future object when
async
wasTrue
Return type: concurrent.futures.Future
-
classmethod
Metadata class and utility functions.
get_isbn_suggestions()
and get_isbn_metadata()
return a
dictionary with the following keys (which corresponds to the Dublin Core
field of the same name): creator, identifier, date, language.
-
spreads.metadata.
get_isbn_suggestions
(query)[source]¶ For a given query, return a list of metadata suggestions.
Parameters: query (unicode) – Search query Returns: List of suggestions Return type: list of dict
-
spreads.metadata.
get_isbn_metadata
(isbn)[source]¶ - For a given valid ISBN number (-10 or -13) return the corresponding
- metadata.
Parameters: isbn (unicode) – A valid ISBN-10 or ISBN-13 Returns: Metadata for ISBN Return type: dict or None if ISBN is not valid or does not exist
-
class
spreads.metadata.
SchemaField
(key, description=None, multivalued=False)[source]¶ Definition of a field in a metadata schema.
Attr key: Key/field name Attr description: Description of the field Attr multivalued: Whether the field can hold multiple values
-
class
spreads.metadata.
Metadata
(base_path)[source]¶ dict-like object that has a schema of metadata fields (currently hard-wired to Dublin Core) and persists all operations to a dcmeta.txt text file on the disk.
Configuration entities.
-
class
spreads.config.
OptionTemplate
(value, docstring=None, selectable=False, advanced=False, depends=None)[source]¶ Definition of a configuration option.
Attr value: The default value for the option or a list of available options if :py:attr`selectable` is True Attr docstring: A string explaining the configuration option Attr selectable: Make the OptionTemplate a selectable, i.e. value contains a list or tuple of acceptable values for this option, with the first member being the default selection. Attr advanced: Whether the option is an advanced option Attr depends: Make option dependant of some other setting (if passed a dict) or another plugin (if passed a string)
-
class
spreads.config.
Configuration
(appname=u'spreads')[source]¶ Entity managing configuration state.
Uses
confit.Configuration
underneath the hood and follows its ‘overlay’-principle. Proxies__getitem__()
and__setitem__()
from it, so it can be used as a dict-like type.-
__init__
(appname=u'spreads')[source]¶ Create new instance and load default and current configuration.
Parameters: appname – Application name, configuration will be loaded from this name’s default configuration directory
-
load_templates
()[source]¶ Get all available configuration templates from the activated plugins.
Returns: Mapping from plugin name to template mappings. Return type: dict unicode -> (dict unicode -> OptionTemplate
)
-
cfg_path
¶ Path to YAML file of the user-specific configuration.
Returns: Path Return type: pathlib.Path
-
with_overlay
(overlay)[source]¶ Get a new configuration that overlays the provided configuration over the present configuration.
Parameters: overlay ( confit.ConfigSource
or dict) – The configuration to be overlaidReturns: A new, merged configuration Return type: confit.Configuration
-
load_defaults
(overwrite=True)[source]¶ Load default settings from option templates.
Parameters: overwrite – Whether to overwrite already existing values
-
set_from_template
(section, template, overwrite=True)[source]¶ Set default options from templates.
Parameters: - section (unicode) – Target section for settings
- overwrite – Whether to overwrite already existing values
-
set_from_args
(args)[source]¶ Apply settings from parsed command-line arguments.
Parameters: args ( argparse.Namespace
) – Parsed command-line arguments
-
Various utility functions and classes.
-
exception
spreads.util.
MissingDependencyException
[source]¶ Raised when a dependency for a plugin is missing.
-
spreads.util.
find_in_path
(name)[source]¶ Find executable in $PATH.
Parameters: name (unicode) – name of the executable Returns: Path to executable or None if not found Return type: unicode or None
-
spreads.util.
is_os
(osname)[source]¶ Check if the current operating system matches the expected.
Parameters: osname – Operating system name as returned by platform.system()
Returns: Whether the OS matches or not Return type: bool
-
spreads.util.
check_futures_exceptions
(futures)[source]¶ - ” Go through passed
concurrent.futures._base.Future
objects - and re-raise the first Exception raised by any one of them.
Parameters: futures (iterable with concurrent.futures._base.Future
instances) – Iterable that contains the futures to be checked- ” Go through passed
-
spreads.util.
get_free_space
(path)[source]¶ Return free space on file-system underlying the passed path.
Parameters: path – Path on file-system the free space of which is desired. :type path; unicode :return: Free space in bytes. :rtype: int
-
spreads.util.
get_subprocess
(cmdline, **kwargs)[source]¶ Get a
subprocess.Popen
instance.On Windows systems, the process will be ran in the background and won’t open a cmd-window or appear in the taskbar. The function signature matches that of the
subprocess.Popen
initialization method.
-
spreads.util.
wildcardify
(pathnames)[source]¶ - Try to generate a single path with wildcards that matches all
- pathnames.
Parameters: pathnames – List of pathnames to find a wildcard string for Returns: The wildcard string or None if none was found Return type: unicode or None
-
spreads.util.
diff_dicts
(old, new)[source]¶ Get the difference between two dictionaries.
Parameters: Returns: A (possibly nested) dictionary containing all items from new that differ from the ones in old
Return type:
-
spreads.util.
slugify
(text, delimiter=u'-')[source]¶ Generates an ASCII-only slug.
Code adapted from Flask snipped by Armin Ronacher: http://flask.pocoo.org/snippets/5/
Parameters: Returns: The generated slug
Return type:
-
class
spreads.util.
abstractclassmethod
(func)[source]¶ - New decorator class that implements the @abstractclassmethod decorator
- added in Python 3.3 for Python 2.7.
-
class
spreads.util.
ColourStreamHandler
(stream=None)[source]¶ A colorized output StreamHandler
Kudos to Leigh MacDonald: http://goo.gl/Lpr6C5
-
class
spreads.util.
EventHandler
(level=0)[source]¶ Subclass of
logging.Handler
that emits ablinker.base.Signal
whenever a new record is emitted.-
on_log_emit
= <blinker.base.NamedSignal object at 0x7fc9f2385210; u'logrecord'>¶
-
-
spreads.util.
get_data_dir
(create=False)[source]¶ Return (and optionally create) the user’s default data directory.
Parameters: create (bool) – Create the data directory if it doesn’t exist Returns: Path to the default data directory Return type: unicode
-
spreads.util.
colorize
(text, color)[source]¶ Return text with a new ANSI foreground color.
Parameters: - text – Text to be wrapped
- color (str (from colorama.ansi <http://git.io/9qnt0Q>)) – ANSI color to wrap text in
Returns: Colorized text
-
class
spreads.util.
RomanNumeral
(value, case=u'upper')[source]¶ Number type that represents integers as Roman numerals and that can be used in all arithmetic operations applicable to integers.
-
static
is_roman
(value)[source]¶ Check if value is a valid Roman numeral.
Parameters: value (unicode) – Value to be checked Returns: Whether the value is valid or not Return type: bool
-
__init__
(value, case=u'upper')[source]¶ Create a new instance.
Parameters: value (int, unicode containing valid Roman numeral or RomanNumeral
) – Value of the instance
-
static
-
class
spreads.util.
CustomJSONEncoder
(skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, encoding='utf-8', default=None)[source]¶ Custom
json.JSONEncoder
.Uses an object’s to_dict method if present for serialization.
Serializes
pathlib.Path
instances to the string representation of their relative path to a BagIt-compliant directory or their absolute path if not applicable.
Core logic for application startup and parsing of command-line arguments
-
spreads.main.
add_argument_from_template
(extname, key, template, parser, current_val)[source]¶ Add option from template to parser under the name key.
Templates with a boolean value type will create a –<key> or –no-<key> flag, depending on their current value.
Parameters: - extname – Name of the configuration section this option’s result should be stored in
- key – Configuration key in section, will also determine the name of the argument.
- template (
spreads.config.OptionTemplate
) – Template for the argument - parser (
argparse.ArgumentParser
) – Argument parser the argument should be added to - current_val – Current value of the option
-
spreads.main.
run_config_windows
()[source]¶ Entry point to launch graphical configuration dialog on Windows.
-
spreads.main.
setup_logging
(config)[source]¶ Conigure application-wide logger.
Parameters: config ( spreads.config.Configuration
) – Global configuration
-
spreads.main.
setup_parser
(config)[source]¶ Sets up an
argparse.ArgumentParser
instance with all options and subcommands that are available in the core and activated plugins.Parameters: config ( spreads.config.Configuration
) – Current application configurationReturns: Fully initialized argument parser Return type: argparse.ArgumentParser
-
spreads.main.
should_show_argument
(template, active_plugins)[source]¶ Checks the
spreads.config.OptionTemplate.depends
attribute for dependencies on other plugins and validates them against the list of activated plugins.We do not validate dependencies on other configuration settings because we don’t have access to the final state of the configuration at this time, since the configuration can potentially be changed by other command-line flags.
Parameters: - template (
spreads.config.OptionTemplate
) – Template to check - active_plugins – List of names of activated plugins
Returns: Whether or not the argument should be displayed
- template (
Command-Line interface for configuration, capture, output and postprocessing.
-
spreads.cli.
getch
()[source]¶ Waits for a single character to be entered on stdin and returns it.
Returns: Character that was entered Return type: str
-
spreads.cli.
draw_progress
(progress)[source]¶ Draw a progress bar to stdout.
Parameters: progress (float) – Progress value between 0 and 1
-
spreads.cli.
configure
(config)[source]¶ Configuration subcommand that runs through the various dialogs, builds a new configuration and writes it to disk.
Parameters: config ( spreads.config.Configuration
) – Currently active global configuration
-
spreads.cli.
capture
(config)[source]¶ Dialog to run through the capture process.
Parameters: config ( spreads.config.Configuration
) – Currently active global configuration
-
spreads.cli.
postprocess
(config)[source]¶ Launch postprocessing plugins and display their progress
Parameters: config ( spreads.config.Configuration
) – Currently active global configuration
-
spreads.cli.
output
(config)[source]¶ Launch output plugins and display their progress
Parameters: config ( spreads.config.Configuration
) – Currently active global configuration
-
spreads.cli.
wizard
(config)[source]¶ Launch every step in succession with the same configuration.
Parameters: config ( spreads.config.Configuration
) – Currently active global configuration
Graphical configuration dialog.
-
class
spreads.tkconfigure.
TkConfigurationWindow
(spreads_config, master=None)[source]¶ Window that holds the dialog
-
__init__
(spreads_config, master=None)[source]¶ Initialize Window with global configuration.
Parameters: spreads_config ( spreads.config.Configuration
) – Global configuration
-
update_plugin_config
(plugins)[source]¶ - Update list of activated plugins and load its default
- configuration.
Parameters: plugins (list of unicode) – List of names of plugins to activate
-
on_update_driver
(event)[source]¶ Callback for when the user selects a driver.
Updates the driver in the configuration and toggles the status of widgets that depend on certain device features.
Parameters: event ( Tkinter.Event
) – Event from Tkinter
-
on_update_plugin_selection
(event)[source]¶ Callback for when the user toggles a plugin.
Tries to load the newly selected plugins. If loading fails, a dialog with the cause of failure will be displayed and the plugin will be highlighted in the list and made inactive. If successful, the plugin will be added to the ‘postprocessing order’ widget (if it implements
spreads.plugin.ProcessHooksMixin
) and the configuration will be updated.Parameters: event ( Tkinter.Event
) – Event from Tkinter
-
on_process_plugin_move
(event)[source]¶ - Callback for when the user changes the position of a plugin in
- the postprocessing order widget.
Updates the widget and writes the new order to the configuration.
Parameters: event ( Tkinter.Event
) – Event from Tkinter
-
set_orientation
(target)[source]¶ Set target page on a device.
Prompts the user to connect a device, prompts to retry or cancel on failure. If successful, updates the target page setting on the device.
Parameters: target (unicode, one of “odd” or “even”) – Target page to set on device
-
spreadsplug¶
spreadsplug package
This package contains all of the plugins and device drivers that are shipped with the application and supported by the spreads developers themselves.
In alphabetical order:
spreadsplug.autorotate
- Postprocessing plugin to rotate captured images according to their EXIF orientation tag.
spreadsplug.dev.chdkcamera
- Driver for Canon cameras with the CHDK firmware.
spreadsplug.dev.gphoto2
- Driver for cameras supported by libgphoto2
spreadsplug.dev.dummy
- Dummy driver that implements the driver interface and just spits out one of the two test images. Intended for rapid development, not for general usage.
spreadsplug.djvubind
- Output plugin to compress and bundle images (and OCRed text) into a single DJVU file using the djvubind utility.
spreadsplug.gui
- Subcommand plugin for a graphical wizard using Qt (via the PySide bindings)
spreadsplug.hidtrigger
- Trigger plugin to initiate a capture from USB HID devices (like foot-pedals or gamepads)
spreadsplug.intervaltrigger
- Trigger plugin to initiate a capture in a configurable interval.
spreadsplug.pdfbeads
- Output plugin to compress and bundle images (and OCRed text) into a single PDF file using the pdfbeads utility.
spreadsplug.scantailor
- Postprocesing plugin to put captured images through the ScanTailor application.
spreadsplug.tesseract
- Postprocessing plugin to perform optical character recognition on the images, using the tesseract application
spreadsplug.web
- Subcommand plugin for a RESTful HTTP API (implemented with Flask and Tornado) and a single-page JavaScript web application (implemented with ReactJS)
Trigger plugin that triggers in a configurable interval.
Plugin to provide a RESTful HTTP API and a single-page web application for controlling the software.
The code for the plug in is split across the following server-side modules:
spreadsplug.web.app
- Contains the subcommand hook as well as the initialization code for the web application.
spreadsplug.web.endpoints
- WSGI endpoints that provide most parts of the RESTful interface, implemented with Flask.
spreadsplug.web.handlers
- Tornado HTTP handlers for long-polling and chunked downloading endpoints, as well as a WebSocket handler for sending out server-side events to all clients.
spreadsplug.web.tasks
- Implementations of long-running tasks that are performed in the background, across multiple request-response-cycles, through the Huey task queue.
spreadsplug.web.discovery
- Code for both advertising of postprocessing-servers via UDP multi-casting, as well as the auto-discovery of said servers from other instances.
spreadsplug.web.util
- Various utility classes and functions for the plugin.
spreadsplug.web.winservice
- Code for a simple Windows service that runs the application in the background and provides a small taskbar-icon to allow opening a browser and shutting down the appplication.
For the documentation of the client-side part, please refer to the following document: TODO
HTTP API¶
The web plugin also exposes all of its functions through a REST-ish API. You can use it to write small scripts or even for a full-blown Android or iPhone application, if you feel so inclined.
-
GET
/api/remote/templates
¶ Get option templates for all available plugins from a remote server.
Behaves exactly like
GET /api/templates
.Query Parameters: - server – Hostname of remote server
Response Headers: - Content-Type – application/json
-
GET
/api/remote/discover
¶ Get list of available postprocessing servers on network.
Response Headers: - Content-Type – application/json
Request JSON Object: - servers (array) – List of available server addresses
-
POST
/api/system/shutdown
¶ Shut down device.
Requires that the user running the application has permission to run shutdown -h now via sudo. Note that this endpoint will never send a response, clients should take this into account and set a low timeout value.
-
GET
/api/remote/plugins
¶ Get available plugin names from a remote server, grouped by type.
Behaves exactly like
GET /api/plugins
.Query Parameters: - server – Hostname of remote server
Response Headers: - Content-Type – application/json
-
GET
/api/remote/config
¶ Get default configuration from a remote server.
Behaves exactly like
GET /api/config
.Query Parameters: - server – Hostname of remote server
Response Headers: - Content-Type – application/json
-
POST
/api/system/reboot
¶ Reboot device.
Requires that the user running the application has permission to run shutdown -r now via sudo. Note that this endpoint will never send a response, clients should take this into account and set a low timeout value.
-
GET
/api/templates
¶ For every activated plugin, get all option templates.
Response Headers: - Content-Type – application/json
-
POST
/api/workflow
¶ Create a new workflow.
Request Headers: - Accept – application/json
Request JSON Object: - config (object) – Configuration for new workflow
- metadata (object) – Metadata for new workflow
Response Headers: - Content-Type – application/json
Status Codes: - 200 OK – When everything was OK.
- 400 Bad Request – When validation of configuration or metadata failed.
-
GET
/api/workflow
¶ Return a list of all workflows.
Response Headers: - Content-Type – application/json
-
GET
/api/plugins
¶ Get names of available and activated postprocessing and output plugins.
Response Headers: - Content-Type – application/json
Response JSON Object: - postprocessing (array) – List of postprocessing plugin names
- output (array) – List of output plugin names
-
GET
/api/config
¶ Get global default configuration.
Response Headers: - Content-Type – application/json
-
PUT
/api/config
¶ Update global default configuration.
If core or web settings were modified, the application will be restarted.
Request Headers: - Content-Type – application/json
Response Headers: - Content-Type – application/json
-
POST
/api/reset
¶ Restart the application.
Note that this endpoint will never send a response, clients should take this into account and set a low timeout value.
-
GET
/api/isbn
¶ Search for ISBN records.
Query Parameters: - q – Search query
Response Headers: - Content-Type – application/json
Response JSON Object: - results (array) – Matching ISBN records
Status Codes: - 200 OK – When the query was successful
- 400 Bad Request – When no search query was supplied
-
GET
/api/log
¶ Get application log.
Query Parameters: - start – Index of first message (default: 0)
- count – Number of messages to return (default: 50)
- level – Maximum log level to be included in messages (default: INFO)
Response Headers: - Content-Type – application/json
Response JSON Object: - total_num (boolean) – Total number of messages
- messages (array) – Requested messages
-
GET
/api/workflow/
(workflow: workflow)/page/
(int: number)/
(img_type)/
(plugname)/thumb
¶ Get thumbnail for a page image.
-
GET
/api/workflow/
(workflow: workflow)/page/
(int: number)/
(img_type)/thumb
¶ Get thumbnail for a page image.
-
POST
/api/workflow/
(workflow: workflow)/page/
(int: number)/
(img_type)/crop
¶ Crop a page image in place.
-
GET
/api/workflow/
(workflow: workflow)/page/
(int: number)/
(img_type)/
(plugname)¶ Get image for requested page.
Parameters: - workflow (str) – UUID or slug for a workflow
- number (int) – Capture number of requested page
- img_type (str, one of raw or processed) – Type of image
- plugname (str) – Only applicable if img_type is processed,
selects the desired processed file by its key in the
spreads.workflow.Workflow.processed_images
dictionary.
Query Parameters: - width – Optionally scale down image to the desired width
- format – Optionally convert image to desired format. If browser is specified, non-JPG or PNG images will be converted to PNG.
Response Headers: - Content-Type – Depends on value of format, by default the mime-type of the original image.
-
GET
/api/workflow/
(workflow: workflow)/page/
(int: number)/
(img_type)¶ Get image for requested page.
Parameters: - workflow (str) – UUID or slug for a workflow
- number (int) – Capture number of requested page
- img_type (str, one of raw or processed) – Type of image
- plugname (str) – Only applicable if img_type is processed,
selects the desired processed file by its key in the
spreads.workflow.Workflow.processed_images
dictionary.
Query Parameters: - width – Optionally scale down image to the desired width
- format – Optionally convert image to desired format. If browser is specified, non-JPG or PNG images will be converted to PNG.
Response Headers: - Content-Type – Depends on value of format, by default the mime-type of the original image.
-
GET
/api/workflow/
(workflow: workflow)/output/
(fname)¶ Download an output file.
Parameters: - workflow (str) – UUID or slug for the workflow to download from
- fname (str) – Filename of the output file to download
Status Codes: - 200 OK – Everything OK.
- 404 Not Found – Workflow or filename not found
-
GET
/api/workflow/
(workflow: workflow)/page/
(int: number)¶ Get a single page.
Parameters: - workflow (str) – UUID or slug for a workflow
- number (int) – Capture number of requested page
Response Headers: - Content-Type – application/json
-
DELETE
/api/workflow/
(workflow: workflow)/page/
(int: number)¶ Remove a single page from a workflow.
-
POST
/api/workflow/
(workflow: workflow)/prepare_capture
¶ Prepare capture for the requested workflow.
-
POST
/api/workflow/
(workflow: workflow)/finish_capture
¶ Wrap up capture process on the requested workflow.
-
GET
/api/workflow/
(workflow: workflow)/download
¶ - Redirect to download endpoint (see
spreadsplug.web.handlers.ZipDownloadHandler
orspreadsplug.web.handlers.TarDownloadHandler
) with proper filename set.
Parameters: - workflow (str) – UUID or slug for the workflow to download
Query Parameters: - fmt – Archive format for download (zip or tar, default: tar)
Status Codes: - 302 Found – Redirects to :http:get:`/api/workflow/\ (str:workflow_id)/download/\ (str:workflow_slug).(str:archive_extension)`
-
POST
/api/workflow/
(workflow: workflow)/transfer
¶ Enqueue workflow for transfer to an attached USB storage device.
Requires that the python-dbus package is installed.
Once the transfer was succesfully enqueued, watch for the
spreadsplug.web.tasks.on_transfer_started
which is emitted when the transfer actually started and subsequentlyspreadsplug.web.tasks.on_transfer_progressed
andspreadsplug.web.tasks.on_transfer_completed
.Parameters: - workflow (str) – UUID or slug for the workflow to be transferred
Status Codes: - 200 OK – When the transfer was successfully enqueued.
- 500 Internal Server Error – When the python-dbus package was not found.
- 503 Service Unavailable – When no removable USB device could be found for mounting.
-
POST
/api/workflow/
(workflow: workflow)/capture
¶ Trigger a capture on the requested workflow.
Optional parameter ‘retake’ specifies if the last shot is to be retaken.
Returns the number of pages shot and a list of the pages captured by this call in JSON notation.
-
POST
/api/workflow/
(workflow: workflow)/process
¶ Enqueue the specified workflow for postprocessing.
-
POST
/api/workflow/
(workflow: workflow)/submit
¶ Enqueue workflow for submission to a postprocessing server.
It is possible to submit a configuration object that should be used on the remote end for the workflow. Optionally, it can be specified if postprocessing and output generation should immediately be enqueued on the remote server.
Once the submission was succesfully enqueued, watch for the
spreadsplug.web.tasks.on_submit_started
which is emitted when the submission actually started and subsequentlyspreadsplug.web.tasks.on_submit_progressed
,spreadsplug.web.tasks.on_submit_completed
andspreadsplug.web.tasks.on_submit_error
.Request Headers: - Accept – application/json
Parameters: - workflow (str) – UUID or slug for the workflow to be submitted
Request JSON Object: - server (string) – Address of server to submit to
- config (object) – Configuration to use for workflow on remote server.
- start_process (boolean) – Whether to enqueue workflow for post-processing on the remote server.
- start_output (boolean) – Whether to enqueue workflow for output generation on the remote server.
Status Codes: - 200 OK – When the transfer was successfully enqueued.
- 400 Bad Request – When no postprocessing server was specified
- 500 Internal Server Error – When the python-dbus package was not found.
- 503 Service Unavailable – When no removable USB device could be found for mounting.
-
POST
/api/workflow/
(workflow: workflow)/output
¶ Enqueue the specified workflow for output generation.
-
GET
/api/workflow/
(workflow: workflow)/page
¶ Get all pages for a workflow.
Parameters: - workflow (str) – UUID or slug for a workflow
Response Headers: - Content-Type – application/json
-
DELETE
/api/workflow/
(workflow: workflow)/page
¶ Delete multiple pages from a workflow with one request.
-
GET
/api/workflow/
(workflow: workflow)¶ Return a single workflow.
Parameters: - workflow (str) – UUID or slug for a workflow
Response Headers: - Content-Type – application/json
-
PUT
/api/workflow/
(workflow: workflow)¶ Update a single workflow.
Parameters: - workflow (str) – UUID or slug for the workflow to be updated
Request JSON Object: - config (object) – Updated workflow configuration
- metadata (object) – Updated workflow metadata
Response Headers: - Content-Type – application/json
Status Codes: - 200 OK – When everything was OK.
- 400 Bad Request – When validation of configuration or metadata failed.
-
DELETE
/api/workflow/
(workflow: workflow)¶ Delete a single workflow from database and disk.
Parameters: - workflow (str) – UUID or slug for the workflow to be updated
Status Codes: - 200 OK – When deletion was succesful
-
GET
/api/isbn/
(isbn)¶ Get metadata for a given ISBN number.
Parameters: - isbn (str/unicode with valid ISBN-10 or ISBN-13, optionally prefixed with isbn:) – ISBN number to retrieve metadata for
Response Headers: - Content-Type – application/json
Status Codes: - 200 OK – When the ISBN was valid and a match was found.
- 400 Bad Request – When the ISBN was invalid or no match was found.
-
GET
/static/
(path: filename)¶ Function used internally to send static files from the static folder to the browser.
New in version 0.5.
Changelog¶
0.5 (2014/03/??)¶
- A web interface that currently supports creating workflows, capturing images and downloading them as a ZIP file.
- New plugins to trigger capture across all interfaces: ‘hidtrigger’ for USB HID devices, ‘intervaltrigger’ to trigger a capture in regular intervals
- Use new, optimized JPEG processing library
- Plugin API now useses mixin classes to declare which hooks are implemented
- Made ‘chdkcamera’ driver more resilient
0.4.2 (2014/01/05)¶
- Fix packaging issues
- Small bugfix for older Tesseract versions
0.4.1 (2013/12/25)¶
- Fix ‘spread’ tool
- Include missing vendor package in distribution
0.4 (2013/12/25)¶
- Use chdkptp utility for controlling cameras with CHDK firmware
- Fix instability when shooting with CHDK cameras
- Shoot images in RAW/DNG file format (experimental)
- Remove download step, images will be directly streamed to the project directory
- Remove combine plugin, images will be combined in capture step
- Device driver and plugins, as well as their order of execution can be set interactively via the configure subcommand, which has to be run before the first usage.
- Lots of internal API changes
0.3.3 (2013/08/28)¶
- Fix typo in device manager that prevent drivers from being loaded
0.3.2 (2013/08/24)¶
- Fixes a critical bug in the devices drivers
0.3.1 (2013/08/23)¶
- Fixes a bug that prevented spreads to be installed
0.3 (2013/08/23)¶
- Plugins can add completely new subcommands.
- GUI plugin that provides a graphical workflow wizard.
- Tesseract plugin that can perform OCR on captured images.
- pdfbeads plugin can include recognized text in a hidden layer if OCR has been performed beforehand.
- Use EXIF tags to persist orientation information instead of JPEG comments.
- Better logging with colorized output
- Simplified multithreading/multiprocessing code
- CHDK driver is a lot more stable now
0.2 (2013/06/30)¶
- New plugin system based on Doug Hellmann’s stevedore package, allows packages to extend spreads without being included in the core distribution
- The driver for CHDK cameras no longer relies on gphoto2 and ptpcam, but relies on Abel Deuring’s pyptpchdk package to communicate with the cameras.
- Wand is now used to deal with image data instead of Pillow
- New ‘colorcorrection’ plugin allows users to automatically correct white balance.
- Improved tutorial
0.1 (2013/06/23)¶
- Initial release
About Spreads¶
spreads is a software suite for the digitization of printed material. Its main focus is to integrate existing solutions for individual parts of the scanning workflow into a cohesive package that is intuitive to use and easy to extend.
At its core, it handles the communication with the imaging devices, the post-processing of the captured material and its assembly into output formats like PDF or ePub. On top of this base layer, we have built a variety of interfaces that should fit into most use cases: A full-fledged and mobile-friendly web interface that works on even the most low-powered devices (like a Raspberry Pi, through the spreadpi distribution), a graphical wizard for classical desktop users and a bare-bones command-line interface for purists.
As for extensibility, we offer a plugin API that allows developers to hook into almost every part of the architecture and extend the application according to their needs. There are interfaces for developing a device driver to communicate with new hardware, for writing new postprocessing or output plugins to take advantage of a as of yet unsupported third-party software. There is even the possibility to create a completely new user interface that is better suited for specific environments.
The spreads core is completely written in the Python programming language, which is widespread, easy to read and to learn (and beautiful on top of that). Individual plugins also contain parts written in JavaScript and Lua. Through the web-plugin it also offers a REST(-ish) API that can be accessed with any programming language that has a HTTP library.
To get started with the software, we suggest you begin by reading the Introductory Notes that lay out the general workflow of the application and explain some of the terminology used across all interfaces. Then, if you want to install and configure the software yourself, head over to the Installation and Setup guide. If you are a user of the spreadpi distribution or plan on using it, use the spreadpi guide.
Note
In case you’re wondering about the choice of mascot, the figure depicted is a Benedictine monk in his congregation’s traditional costume, sourced from a series of 17th century etchings by the Bohemian artist Wenceslaus Hollar, depicting the robes of various religious orders. The book he holds in his hand is no accident, but was likely delibaretely chosen by the artist: The Benedictines used to be among the most prolific copiers of books in the middle-ages, preserving Europe’s written cultural heritage, book spread for book spread, in a time when a lot of it was in danger of perishing. spreads wants to help you do the same in the present day. Furthermore, the Benedictines were (and still are) very active missionaries, going out into the world and spreading ‘the word’. spreads wants you to do the same with your digitized books (within the boundaries of copyright law, of course).