Robert Rowe
Interactive Music Systems
The main contribution to my thesis is in chapter 5, on machine listening. The chapter outlines a taxonomy of feature-sensing agents used in a machine listening program called Cypher. The system operates strictly on MIDI data. Even so, it provides a useful list of ways to interpret sensed information, and insight into the difficulties of extracting meaning from sensed data.
Interesting concepts:
Automatic thresholds
Focus
Decay
Manual assignment of thresholds
Summary:
Notes and Events are the main structures that the system analyzes. These structures contain information about the pitch, duration, and timing of actions.
Focus and Decay: Focus is the ability to change the context of what the system evaluates. The focus can be wide if values are changing over a large range of space and time, or narrow if detecting a change requires analyzing a small spectrum of possibilities.
Decay: "the adjustment of focal scale over time"
Agents:
Register agent: "classifies the pitch range within which an event is placed". In Cypher this is represented by two bits. Register, as it is used here, covers judgments like "it's a high pitch range", "it's in the middle", or "it's a low pitch".
The book describes an application of focus that helps the system make judgments about register. In my vocabulary I would describe this as an automatic threshold technique. Rowe determines the register by keeping track of the maximum and minimum pitch over a period of time. This determines the range over which register can be judged. If the range is less than two octaves, register is classified into two ranges: high and low. If it is more, register is classified into four.
Rowe describes decay as a necessary complement to focus. Decay adjusts the range over which register is judged by slowly contracting it over time if new data is not moving the endpoints outward from the min and max values. The timing and rate of this contraction are determined empirically.
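The register technique above can be sketched in a few lines of Python. This is only an illustration of the idea as summarized in these notes, not Cypher's actual code: the class name, the decay_step and decay_after parameters, and their values are assumptions made for the example, and the even division of the range into bands is my own simplification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RegisterTracker:
    """Running pitch range (focus) that slowly contracts (decay)."""
    decay_step: int = 1    # semitones each endpoint moves inward (assumed value)
    decay_after: int = 8   # quiet events before the range contracts (assumed value)
    low: Optional[int] = None
    high: Optional[int] = None
    idle: int = 0          # events seen since an endpoint last moved

    def classify(self, pitch: int) -> Tuple[int, int]:
        """Return (register index, number of registers) for a MIDI pitch."""
        if self.low is None or self.high is None:
            # First event: the range is a single pitch.
            self.low = self.high = pitch
        elif pitch < self.low or pitch > self.high:
            # Focus: new data pushes an endpoint outward.
            self.low = min(self.low, pitch)
            self.high = max(self.high, pitch)
            self.idle = 0
        else:
            self.idle += 1
            wide_enough = self.high - self.low > 2 * self.decay_step
            if self.idle >= self.decay_after and wide_enough:
                # Decay: contract the range, keeping the current pitch inside.
                self.low = min(self.low + self.decay_step, pitch)
                self.high = max(self.high - self.decay_step, pitch)
                self.idle = 0
        span = self.high - self.low
        # Under two octaves (24 semitones): high/low only. Otherwise four bands.
        bands = 2 if span < 24 else 4
        index = (pitch - self.low) * bands // (span + 1)
        return index, bands
```

Index 0 is the lowest register. Because the thresholds come from the observed range rather than from fixed pitch values, the same pitch can be "high" in one passage and "low" in another.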
Dynamics: Here, dynamics is the relative loudness of the composition's events. Because instruments vary widely in the way they interpret MIDI data, each instrument must be thresholded differently. Rowe uses MIDI data, so he is limited to the information carried by MIDI events. This is an example of where an automatic thresholding algorithm can't be applied because of ambiguity: the computer cannot know whether the score is being played softly or the instrument has a small range of response.
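A minimal sketch of manual threshold assignment for dynamics follows. The instrument names, the threshold values, and the two-way soft/loud split are all hypothetical choices for illustration; the notes above only establish that each instrument needs its own hand-set threshold.

```python
# Hypothetical hand-assigned boundaries between "soft" and "loud",
# expressed as MIDI velocities (0-127), one per instrument.
VELOCITY_THRESHOLDS = {
    "piano": 64,
    "sampler": 90,
}


def classify_dynamic(instrument: str, velocity: int) -> str:
    """Classify a MIDI velocity using the instrument's own threshold."""
    threshold = VELOCITY_THRESHOLDS[instrument]
    return "loud" if velocity >= threshold else "soft"
```

With these made-up values, a velocity of 70 reads as loud on the piano and soft on the sampler, which is the ambiguity that keeps the threshold from being inferred from the data alone.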
Density: Vertical and horizontal. Horizontal density is speed, and vertical density is "the number and spacing of events played simultaneously".
In vertical density, focus and decay are ignored due to the low resolution of the measure.
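The two density measures might be computed along these lines. The Event structure, its field names, and the choice of events per second for speed are assumptions for the sketch, not Rowe's representation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Event:
    onset: float               # onset time in seconds (assumed unit)
    pitches: Tuple[int, ...]   # MIDI pitches sounding together, non-empty


def vertical_density(event: Event) -> Tuple[int, int]:
    """Return (number of simultaneous notes, spacing in semitones)."""
    return len(event.pitches), max(event.pitches) - min(event.pitches)


def horizontal_density(events: List[Event]) -> float:
    """Return speed as events per second over the given window."""
    if len(events) < 2:
        return 0.0
    elapsed = events[-1].onset - events[0].onset
    # Count intervals between onsets, not events, to get a rate.
    return (len(events) - 1) / elapsed if elapsed > 0 else 0.0
```

The note count in vertical_density takes only a handful of distinct values, which is consistent with the remark that the measure is too coarse for focus and decay to be useful.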
There are more:
Attack Speed
Duration
Chord identification
Beat analysis and tracking: pulse (regular recurrence of undifferentiated events), meter (differentiates regularly recurring events), and rhythm (pattern of strong and weak beats)
Key Identification
Meter Detection
Higher levels:
Phrase finding
Regularity
The point is that evaluating and analyzing even very simple, clear, reliable data such as MIDI is complicated. Even so, can a general method for analyzing sensor data be established that works across data types and meanings? Is there a canon of techniques that underlies the analysis of sensed data?