interHist - an interactive visualization for statistically enhanced query structures

© Copyright 2012-2013 Accademia Europea Bolzano

This is a sample application of interHist, an interactive visualization for statistically enhanced query structures. The demo shows the results of a query for Italian noun phrases. The data is taken from the free PAISÀ Corpus of Italian web texts (
The structure of complex noun phrases in Italian (cf. Renzi, 1991) is approximated by the following query:

  predet? (det | dem/poss pronoun)? adj* noun (adj | verb ending in ti/te/to/ta)?

It yields more than 24 million results, which are condensed into the interHist visualization based on the part-of-speech tags (see here for details on the tagset) assigned to the tokens of each concordance line.
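
To make the query pattern more concrete, the following minimal sketch matches it against a part-of-speech tagged sentence. It is not the query engine behind the demo: the tag names (PREDET, DET, DEM, POSS, ADJ, NOUN, VERB), the one-letter codes, and the helpers `encode` and `find_noun_phrases` are assumptions made for illustration, and the real tagset differs.

```python
import re

# Hypothetical one-letter codes for simplified tags (not the corpus tagset).
TAG_CODES = {"PREDET": "p", "DET": "d", "DEM": "m", "POSS": "m",
             "ADJ": "a", "NOUN": "n"}

def encode(token):
    """Map a (word, tag) pair to a one-character code."""
    word, tag = token
    if tag == "VERB":
        # The query only admits verbs ending in ti/te/to/ta.
        return "v" if word.lower().endswith(("ti", "te", "to", "ta")) else "x"
    return TAG_CODES.get(tag, "x")  # "x" marks tokens the query does not mention

# predet? (det | dem/poss pronoun)? adj* noun (adj | verb ending in ti/te/to/ta)?
QUERY = re.compile(r"p?[dm]?a*n[av]?")

def find_noun_phrases(tokens):
    """Return every token span whose code sequence matches the query."""
    codes = "".join(encode(t) for t in tokens)
    return [tokens[m.start():m.end()] for m in QUERY.finditer(codes)]

sentence = [("tutti", "PREDET"), ("i", "DET"), ("nuovi", "ADJ"),
            ("libri", "NOUN"), ("pubblicati", "VERB"),
            ("sono", "VERB"), ("qui", "ADV")]
phrases = find_noun_phrases(sentence)
print([word for word, _ in phrases[0]])
```

Encoding each token as a single character lets an ordinary regular expression stand in for the token-level query; each match corresponds to one concordance line.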

Along the x-axis, stacked histograms display the part-of-speech distribution at each token position. Positions are arranged from left to right, following the linear order of the tokens in the matched sequences. Part-of-speech types are encoded by color. Hovering over a bar in the histogram highlights the respective part-of-speech label in the legend to the right. The total number of results is displayed above the diagram.
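
The data behind such a diagram can be sketched as one frequency table per token position. The example below is an illustration only: it assumes that results are left-aligned (position 0 is the first token of each match), which may differ from how interHist aligns sequences, and the function name `position_distributions` and the tag names are hypothetical.

```python
from collections import Counter

def position_distributions(tag_sequences):
    """Return one Counter of tags per token position (left-aligned)."""
    length = max((len(seq) for seq in tag_sequences), default=0)
    columns = [Counter() for _ in range(length)]
    for seq in tag_sequences:
        for pos, tag in enumerate(seq):
            columns[pos][tag] += 1
    return columns

# Four toy results, each reduced to its sequence of part-of-speech tags.
results = [["DET", "NOUN"],
           ["DET", "ADJ", "NOUN"],
           ["NOUN", "ADJ"],
           ["DET", "NOUN", "ADJ"]]
columns = position_distributions(results)
for pos, counts in enumerate(columns):
    print(pos, dict(counts))
```

Each Counter corresponds to one stacked histogram: its keys are the colored segments and its values are the segment heights.
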
The visualization allows for interactive filtering of the data. Clicking on a bar restricts the respective token position to the selected part-of-speech. The filtered results are visualized as a second sequence of stacked histograms next to the primary data.
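
The filtering step can be sketched in the same spirit: keep only the results that carry the selected tag at the selected position, then recompute the per-position counts for the filtered subset. This is a minimal illustration under the assumption of left-aligned positions; `distributions`, `filter_results`, and the tag names are hypothetical and not taken from interHist.

```python
from collections import Counter

def distributions(tag_sequences):
    """Count tags per left-aligned token position."""
    length = max((len(seq) for seq in tag_sequences), default=0)
    columns = [Counter() for _ in range(length)]
    for seq in tag_sequences:
        for pos, tag in enumerate(seq):
            columns[pos][tag] += 1
    return columns

def filter_results(tag_sequences, position, tag):
    """Keep the sequences that have `tag` at `position`."""
    return [seq for seq in tag_sequences
            if position < len(seq) and seq[position] == tag]

results = [["DET", "NOUN"],
           ["DET", "ADJ", "NOUN"],
           ["NOUN", "ADJ"],
           ["DET", "NOUN", "ADJ"]]

primary = distributions(results)              # first sequence of histograms
selected = filter_results(results, 1, "NOUN") # "click" on NOUN at position 1
filtered = distributions(selected)            # second sequence of histograms
print(len(results), len(selected))
```

Keeping the primary counts unchanged and computing the filtered counts separately mirrors the side-by-side display of both histogram sequences.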

Tips for using interHist: