DIVAServices-Spotlight - Experimenting with Document Image Analysis Methods in the Web

paper, specified "long paper"
Authorship
  1. 1. Marcel Würsch

    Université de Fribourg

  2. 2. Michael Bärtschi

    Université de Fribourg

  3. 3. Rolf Ingold

    Université de Fribourg

  4. 4. Marcus Liwicki

    Université de Fribourg

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.


Introduction
We present an easy-to-use web-based user interface which allows scholars working on manuscripts to assess the usefulness of automatic document image analysis (DIA) methods when incorporating automatic processes into their workflows. In contrast to existing web interfaces (Clausner, Pletschacher, and Antonacopoulos, 2011; Embach et al., 2013), this interface allows the user to directly upload images of the manuscript of interest without any registration. Thus, a fast assessment of a variety of algorithms and DIA processes can be performed. DivaServices is not a specialized tool for one specific use-case, but a collection of tools for several tasks in different use cases.

With this web interface we build on our previous initiative (Würsch, Ingold, and Liwicki, 2015) to provide access to a wide range of DIA methods to the research communities in Computer Science and the Humanities. While the existing DivaServices are already useful to integrate state-of-the-art DIA methods into new research applications it is still difficult for researchers with little programming experience to estimate the capabilities of the offered methods. For example, it is not easy to know which binarization, text line segmentation, or OCR method and which parameters would work best on a given manuscript.

In order to overcome this shortcoming, we present a web application that allows interacting with all offered methods. Users are able to upload their own images, perform experiments on them and have the results visualized. With this, researchers in the Humanities should get a better understanding on what the methods developed in the Computer Science community are able to achieve. The other way round, researchers from the Computer Science will have their methods exposed to a much broader range of data and can gather feedback to further improve the methods. This feedback loop should enhance communication between the two communities such that future methods can target the respective needs even better.
DivaServices-Spotlight is built in a highly modular way, providing Graphical User Interface (GUI) blocks for various kinds of input and output parameters. The web interface is therefore automatically updated when new methods are added without the need of any further ado. Furthermore, based on our continuous open source support, this tool is available under a LGPL v2.1 license and the source code can be downloaded from github.

Available at: https://github.com/DIVA-DIA/DIVAServices-Spotlight

DivaServices-Spotlight
DivaServices-Spotlight

Available at: http://divaservices.unifr.ch/spotlight
is a web application that allows user to upload their own image data, perform experiments and investigate the results. This should help in deciding whether an algorithm can help solving a particular problem or not, and which parameters are best for the data at hand. Figure 1 provides an example of such an experiment where the highlighted area (left) is segmented into text lines (right) and visualized for the user.

Figure 1 Example of an executed experiment using DivaServices. The highlighted region (left) is segmented into separate text lines (right) and visualized for the user.

In this part we give an overview of how the different user interface works and provide an example on how to perform a workflow.

The User Interface
From the welcome page the three main parts of DivaServices-Spotlight are available: Images, for uploading and manipulating input images; Algorithms, for executing methods; and Results, for accessing computed results.

Images
Via the “Images” link the user can view his already uploaded images (Gallery) or upload new images (Upload). In its current version all uploaded images are automatically converted into the PNG format and users can upload a maximum of ten images at the same time.
Users get the possibility to apply various pre-processing steps onto their image. It is possible to crop an image to a specific size and values such as
brightness, contrast, and
saturation can be adjusted. Performing these pre-processing steps can lead to better results of varying methods.

Algorithms
On the “Algorithm” page, all currently available methods are listed (c.f. Figure 2). When selecting “Apply” on a given method, the user is asked to select one of his uploaded images.

Figure 2 The “Algorithms” page provides an overview of all available methods with a short description of what they can be used for. Using the “Apply” button one method can be used.

On the page for a specific algorithm the user then has to specify input for this method. The input elements are created automatically based on the specifications of the method. For certain input elements a method can also specify ranges of possible values. The input of the user is validated and error messages are displayed should the input be not in a valid range.

Figure 3 Different input types and validations. DivaServices-Spotlight offers automatic generation of input blocks for different types of inputs (a) like numbers, strings, and selection. Automatic validation (b) ensures that the user input is within ranges specified by the method.

Figure 3 (a) shows how various input blocks are generated by DivaServices-Spotlight. Currently it is possible to generate blocks for the following elements: strings (textual data), numbers, selection (one of multiple), and checkboxes. In Figure 3 (b) validation of input elements is visualized. When the user inputs data that is not valid for the given input type (e.g. text data for numbers) an error message is displayed and the user cannot execute the method.

Furthermore, an algorithm can specify that a user needs to select a region within the image. This is needed by methods which want to only work on a subset of the image and can speed up the runtime, as well as the quality of the results of a method (e.g., of text line detection). DivaServices-Spotlight allows for drawing the following selections onto an input image: rectangle, polygons, and circles. These regions are drawn using the mouse. Rectangles and circles can be created using a simple click and drag operation. Polygons are created through manually creating every point of the polygon and clicking near the start point to close it. After creating the various highlighters, they can be edited (e.g., a single point of a polygon can be moved to a new location after creation). The various highlighters are visualized in Figure 4.

Figure 4 The different selection methods; rectangle (left), polygon (center), and circle (right).

Once the user has entered necessary parameters and selected a region on the image (if needed) the execution can be started using “Submit”. The user is notified of the process at the top of the page that shows more information when clicked on with the mouse (Figure 5 (a)). Once the execution is finished, again the user is notified by a small balloon that pops up in the top right corner (Figure 5 (b)). Also, the counter behind the “Results” link in the menu navigation is increased (Figure 5 (c)).

Figure 5 Notifications shown to the user about the current status of an execution (a), when an algorithm finishes (b), and the number of available results (c).

Results
The “Results” page provides the user an overview of all available results. Using the “+” button on a specific result will show him the computed result. On the left side the user sees the input image as well the used parameters and on the right side the user gets a visualization of the results.

Figure 6 Results of a text line segmentation method. User input (left) is shown together with the computed results (right). Below the images is the JSON information a programmer would receive when calling the methods on DivaServices directly.

Figure 6 provides an example of a detailed result. The user input is shown (left) with the computed result (right). The image view can be manipulated (dragging, and zooming) to get a better view of certain areas. Below each image is the JSON information that is sent to and received from DivaServices. This information should help programmers to see with what kind of information they have to deal with should they decide to integrate that method into another application.

Using DivaServices-Spotlight for Designing DIA Workflows

We provide an example how DivaServices-Spotlight can be used to design a full workflow. The aim is to build a system that takes an input image and performs OCR on the segmented text lines. For this we need to perform three steps: binarization, text line segmentation, and OCR recognition.

Using the “Save Image” functionality on the result page we save the result image after each step. Figure 7 (a) – (d) show results at each stage using a combination of available methods. Parameters or even method could be changed at each step in order to find the best suited combination.

Figure 7 Results at different stages in the workflow. The input image (a) is binarized (b), segmented into text lines (c) and processed using an OCR algorithm, leading to its digital representation (d).

Once a researcher is satisfied with the results on a small scale, he could then integrate that workflow into his application by directly invoking the methods on DivaServices using his programming language of choice.

Conclusion
With DivaServices-Spotlight we provide a web application to interact with all available methods hosted on DivaServices. Researchers can run small scale experiments to experience the possibilities of the different algorithms. Furthermore, the application provides developers with the necessary information they would need to use the methods outside of DivaServices-Spotlight and integrate them into other applications.

Bibliography
Clausner, C., Pletschacher, S., and Antonacopoulos, A. (2011). Aletheia – An advanced document layout and text ground-truthing system for production environments.
Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, pp. 48–52.

Embach, M., Krause, C., Moulin, C., Rapp, A., Rindone, F., Stotzka, R., … Vanscheidt, P. (2013). eCodicology-Algorithms for the Automatic Tagging of Medieval Manuscripts.
The Linked TEI: Text Encoding in the Web, pp. 172.

Würsch, M., Ingold, R., and Liwicki, M. (2015). DIVAServices – A RESTful Web Service for Document Image Analysis Methods. In
Digital Humanities.

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info

Complete

ADHO - 2016
"Digital Identities: the Past and the Future"

Hosted at Jagiellonian University, Pedagogical University of Krakow

Kraków, Poland

July 11, 2016 - July 16, 2016

454 works by 1072 authors indexed

Conference website: https://dh2016.adho.org/

Series: ADHO (11)

Organizers: ADHO