Responsive Text

About

A screenshot of a slightly-resized Alice

Responsive images are images that, through progressive retargeting (manual or otherwise), respond to their context by resizing in a vaguely intelligent manner. This script applies the same idea to text: it reads a CSV file with salience measures for passages of text and outputs an HTML file that hides or shows parts of the text according to browser width.

For example, this sample was built using a log-likelihood annotated excerpt from Alice in Wonderland (every word was tagged). A more conventional summary system would use sentence-level tagging, but I already had this input from another project, so in it went. Note that because the tagging skipped structural elements (non-word tokens), the output still contains punctuation and newlines even when the associated word is removed. Tagging by sentence would clearly fix this.

For usage, check out the section below. You'll basically need some way of scoring sentence importance to use the tool, though sample input and output are included in the tarball. This tool is designed to form part of a system currently in development as part of my PhD in computational linguistics, but all updates, contributions and ideas are welcome.

Many thanks to Frankie Roberto, who first posted this method of hiding/showing text based on browser size (until now I'd been using a Java-based viewer, which sucked).

Download

Download ResponsiveText.tar.gz. Includes some sample input and output.

Use

The script simply converts one-part-per-row CSV files into HTML with hide/show CSS media rules. I envisage it as the final output stage of a pipeline:

  1. Separation of formatting elements (for use with the IGNORECOL setting below)
  2. Sentence segmentation
  3. Salience calculation
  4. Rendering (via this tool)
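
As a rough illustration of the rendering stage, here is a minimal sketch of how one-part-per-row CSV input could be turned into level-tagged HTML. It is not taken from responsive.rb: the helper names (`escape_html`, `level_for`, `render_spans`), the `l0`, `l1`, ... class names, the `word`/`score` column headers and the linear score-to-level scaling are all assumptions made for the example.

```ruby
require "csv"

# Minimal HTML escaping so the sketch has no dependency beyond csv.
HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

def escape_html(text)
  text.to_s.gsub(/[&<>"]/, HTML_ESCAPES)
end

# Hypothetical helper: map a score to a level in 0...levels by linear
# scaling between the lowest and highest score (level 0 = most salient).
def level_for(score, min, max, levels)
  return 0 if max == min
  fraction = (max - score) / (max - min).to_f
  [(fraction * levels).floor, levels - 1].min
end

# Hypothetical renderer: wrap each part in a span carrying its level class.
def render_spans(csv_text, word_col, score_col, levels)
  rows = CSV.parse(csv_text, headers: true)
  scores = rows.map { |r| r[score_col].to_f }
  min, max = scores.minmax
  rows.map do |r|
    lvl = level_for(r[score_col].to_f, min, max, levels)
    %(<span class="l#{lvl}">#{escape_html(r[word_col])}</span>)
  end.join(" ")
end

# Made-up sample input: one word per row with a salience score.
sample = <<~ROWS
  word,score
  Alice,9.5
  was,0.5
  beginning,5.0
ROWS

html = render_spans(sample, "word", "score", 3)
puts html
# prints: <span class="l0">Alice</span> <span class="l2">was</span> <span class="l1">beginning</span>
```

In this sketch the most salient part lands in level 0 and the least salient in the highest level, so a stylesheet can hide the higher levels first as the window narrows.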

The script itself has plenty of documentation in the source, or you can simply run it without any arguments:

$ ruby responsive.rb 
USAGE: responsive.rb CSV WORDCOL SCORECOL [OUTFILE] [LEVELS] [MINWIDTH] [MAXWIDTH] [IGNORECOL]

Parameters
----------
 CSV        : The CSV file to use as input
 WORDCOL    : The column header holding content to output
 SCORECOL   : The column header for the (numeric) salience score
 OUTFILE    : Name of the output HTML file.
 LEVELS     : Number of levels to use (granularity)
 MINWIDTH   : Width to show almost nothing (in px)
 MAXWIDTH   : Width to show everything (in px)
 IGNORECOL  : If this column is ""/false, will not bother to process
              that token (but will still output it).  Used for formatting.
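
To make the LEVELS, MINWIDTH and MAXWIDTH parameters more concrete, the sketch below shows one way the hide/show media rules could be derived from them. This is an assumption for illustration, not the script's actual logic: the helpers `breakpoints` and `media_rules`, the `l0`, `l1`, ... class names and the evenly spaced breakpoints are all hypothetical.

```ruby
# Hypothetical helper: one breakpoint per level, spaced evenly
# from min_width (level 0) up to max_width (last level).
def breakpoints(levels, min_width, max_width)
  return [min_width] if levels <= 1
  step = (max_width - min_width) / (levels - 1).to_f
  (0...levels).map { |k| (min_width + k * step).round }
end

# Hypothetical helper: emit one CSS rule per level that hides the level
# whenever the viewport is narrower than that level's breakpoint.
def media_rules(levels, min_width, max_width)
  breakpoints(levels, min_width, max_width).each_with_index.map do |px, k|
    "@media (max-width: #{px - 1}px) { .l#{k} { display: none; } }"
  end.join("\n")
end

css = media_rules(3, 400, 1000)
puts css
# prints:
# @media (max-width: 399px) { .l0 { display: none; } }
# @media (max-width: 699px) { .l1 { display: none; } }
# @media (max-width: 999px) { .l2 { display: none; } }
```

With these example values everything is visible from 1000px upwards, and each lower breakpoint removes one more level until nothing is shown below 400px. The real script may space or order its levels differently.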