This visualization shows structured data of any type (e.g. ngrams) together with associated information, using our extension of Parallel Coordinates called Structured Parallel Coordinates. The vertical axes represent the structured data as well as the associated fields. The red vertical bar separates the structured data from the other fields. Each record (e.g. an ngram and its associated data) is represented by a single line connecting the appropriate points on the axes.
One sample data set is from German: [20-29] [space, hyphen, nothing, or a combination] [jähriger], where jähriger is omitted from the display because it is the same for all records. The axes are labelled p1 and p2 for the two tokens of the ngram and count for the absolute count of the ngram; percentP1 indicates the percentage of the selected ngram with respect to p1, and percentP2 indicates the percentage of the selected ngram with respect to p2. One result to look for is that, for a given number, the alternatives that use a hyphen almost always account for somewhat over half of the uses, while the alternatives that use no connection almost always account for somewhat under half.
A second example is from the UKWAC100M web corpus of British English. It shows the distribution of PRONOUN + BE + happy/sad, e.g. "I was happy", "We were sad" etc. One result is that "he was happy" appears more often than "she was happy", but "she is happy" appears more often (percentage-wise with respect to all occurrences of the pronoun) than "he is happy". And yet "he's happy" appears much more often than "she's happy". Hmmm.
Tips for using Structured Parallel Coordinates