Towards the formalisation of structurally-coupling performance modalities and interfaces in human-machine interaction

Article by Dario Sanfilippo

If you put yourself in a situation of unpredictability and then find that it's completely possible to accept it, then you become an observer.

(David Tudor)

1.0 Introduction

In this article, we will investigate the relationships between human performers and autonomous machines within the context of live music and improvisation. First, we propose a framework for improvisation based on cybernetic criteria to structurally couple human and artificial entities as hybrid systems of mutually-determining components. The human and the machine become entangled through incessant and mutating relationships as adaptations to information derived from low-level and high-level sound characteristics. Then, we explore notions of cybernetics for the implementation of human-machine interfaces that allow the performer to reshape the network of relationships within the system's components with agility and organicity. Lastly, the article investigates the role of electroacoustic devices and environments within the context of human-machine interfacing in live performance.

2.0 Cybernetic improvisation

Improvisation represents a configuration where the human and the machine are coupled through a feedback loop: the two entities realise a higher-level system whose input is the sound reaching the human's auditory system and whose output is the sound generated by the machine through the actions of the human (Pressing 1984). Improvisation can also be used to establish recursive loops among several entities. For instance, an improvisation ensemble, specifically an ensemble practising radical improvisation, can be seen as a network of humans where each agent is affected by its own output and the output of the other agents, that is, a fully-connected network. When human performers operate with autonomous machines, the number of entities in the network is the combination of humans and machines involved. For ensembles of human and machine performers, see, for example, (Sanfilippo and Di Scipio 2017).

The practice of electronic music composition in the studio also resembles a feedback configuration, although the process takes place in a non-real-time domain. Namely, the composer working in the studio and following a standard workflow, for example, using software to generate and process sounds that are eventually arranged into pieces through editors and digital audio workstations, cycles through two main phases: modification and result-checking. The iterative process is repeated until the piece is complete, i.e., when it requires no further modifications.

The behaviour of musicians in radical improvisation setups, that is, setups where performers' actions are entirely unsupervised, is not formalised or defined according to any explicit rules. Technically, however, the performers' actions always respond to the outputs generated by the agents involved in the process. Thus, although improvisation allows for the highest degree of freedom, achieving musically compelling results can be challenging because of some inherent characteristics of feedback loops. These can result in behaviours that tend towards steady-state outputs, called attractors, or outputs that oscillate through a limited number of states, called periodic oscillations (Gleick 2011). The hypothesis is that the principles of feedback systems and chaos theory may also apply to feedback configurations involving individuals, such as the practices of improvised electronic music and electronic music composition in the studio, as explained above.

In structured improvisation, constraints or rules are set for the performers to drive the development of the music towards specific directions, predefined paths that should guarantee a certain degree of musical complexity. These constraints or rules limit the performers' freedom but, on the other hand, can avoid undesired outcomes where redundancy and predictability negatively affect the overall performance. These rules, though, are not necessarily related to the sonic context that the performers generate; in other words, the rules are absolute rather than relational.

Composers have explored techniques for relational rules and behaviours in improvisation, and an overview of some of these works can be found in (Dahlstedt et al. 2015), where Palle Dahlstedt et al. also describe the group improvisation system that they have developed. Their system is a relational improvisation model with random elements based on subjective and high-level behavioural rules such as lead, support, and opposition. However, these behaviours are somewhat arbitrary and abstract, and performers may interpret the modalities substantially differently since they are not fully formalised, which can result in diverse networks of relationships. This may well be the goal of the authors.

Another example of formalised improvised performance based on relational criteria can be found in (Murray-Rust and Smaill 2011). Murray-Rust and Smaill propose a model where actions are mediated through analysis functions of the musical surface to create a musical context. The musical context then allows processing the musical outputs to construct a set of performative actions, similar to how processes occur in Speech Act theory. More generally, a thorough analysis of improvisation methods and models can be found in (Pressing 1988), where the concept of feedback is considered fundamental for developing methods and models of the improvisation practice. The improvisation techniques described below follow the same direction.

The improvisation system proposed here follows a cybernetic approach where rules of interaction are well-defined and can be applied to several aspects of sound at the timbral or formal level to achieve higher-level behaviours. Sounds can be analysed based on several low-level information criteria suitable for this improvisation system, some of which have been discussed in (Sanfilippo 2021a). Among the standard low-level information measures, we have loudness and brightness. In the next section, where a performance project based on this system is presented, these low-level features will be used together with noisiness and a higher-level feature of sound events called density, related to the measurement of dynamicity, which provides an index of the number of recognisable variations per unit time. Other low-level criteria may include roughness and spectral spread.
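
As an illustration of how such low-level features might be estimated computationally, the sketch below derives loudness, brightness, and noisiness from a signal frame using standard measures (RMS, spectral centroid, spectral flatness). These are common textbook choices assumed for the example, not the specific algorithms of (Sanfilippo 2021a).

```python
# A sketch of low-level feature estimation, assuming standard measures:
# RMS for loudness, spectral centroid for brightness, and spectral
# flatness for noisiness. Illustrative choices only, not the specific
# algorithms described in (Sanfilippo 2021a).
import numpy as np

def low_level_features(frame, sample_rate):
    spectrum = np.abs(np.fft.rfft(frame)) + 1e-12      # magnitude spectrum
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    loudness = np.sqrt(np.mean(frame ** 2))            # RMS amplitude
    brightness = np.sum(freqs * spectrum) / np.sum(spectrum)  # centroid (Hz)
    # flatness in [0, 1]: geometric over arithmetic mean of the spectrum
    noisiness = np.exp(np.mean(np.log(spectrum))) / np.mean(spectrum)
    return {"loudness": loudness, "brightness": brightness, "noisiness": noisiness}

# Example: features of a 440 Hz tone with added noise.
rng = np.random.default_rng(0)
t = np.arange(4096) / 48000.0
frame = np.sin(2 * np.pi * 440.0 * t) + 0.1 * rng.standard_normal(t.size)
print(low_level_features(frame, 48000.0))
```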

Heylighen and Joslyn (2001) identify three main mechanisms of control in cybernetic and self-regulating systems; one mechanism is feedback. The two fundamental relationships for this improvisation system are the diverging and the converging functions, which correspond, respectively, to positive and negative feedback behaviours, described below. Examples of these behaviours concerning music performance are provided in the next section.

Another control modality in cybernetics that may be used in this approach is buffering. It consists of absorbing the perturbations received by a system through a damping mechanism. In the specific case of an improvising agent, a buffering function may be achieved by recording the incoming perturbations at regular intervals to set the agent's states and transitioning smoothly among successive states. The interval determines the damping degree of the process, and the transitioning modality sets the linearity or nonlinearity of the process. Namely, a smooth and gradual transition would resemble a linear interpolation, while transitioning as-fast-as-possible to the next state would resemble an exponential variation.
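
A minimal sketch of such a buffering function, assuming the perturbation is available as a numeric control sequence: the incoming value is sampled and held at regular intervals, and the output glides towards the stored state either linearly (constant rate) or exponentially (fast at first, then asymptotic).

```python
# A sketch of a buffering (damping) mechanism: the incoming perturbation
# is sampled every `interval` steps, and the output moves towards the
# stored state linearly or exponentially. Names and values are assumptions.
def buffer_control(perturbation, interval=50, mode="linear", coeff=0.05):
    state = target = perturbation[0]
    increment = 0.0
    out = []
    for n, x in enumerate(perturbation):
        if n % interval == 0:
            target = x                                 # sample-and-hold
            increment = (target - state) / interval    # fixed linear slope
        if mode == "linear":
            # constant-rate glide reaching the target by the next sampling point
            if (increment > 0 and state < target) or (increment < 0 and state > target):
                state += increment
        else:
            # one-pole smoothing: fast at first, asymptotic towards the target
            state += coeff * (target - state)
        out.append(state)
    return out

# Example: damping a step perturbation; a longer interval damps more.
signal = [0.0] * 50 + [1.0] * 150
print(round(buffer_control(signal, mode="exponential")[-1], 3))
```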

The application of the functions requires the analysis of each criterion and its quantification. The quantification can have different resolutions according to the detail required by each function. The measurement is likely to be based on subjective estimations of the agents, although computers may also be used to provide the performers with more accurate measurements. Moreover, the analysis can be performed on different sources, which ultimately determines the network topology.

The relational scores realised with this technique can be represented as a two-dimensional array or grid. The x-axis, or timing axis, is divided into sections of different lengths. The timing for each piece may follow different approaches, either objective ones using a centralised timer or individual references for each agent based on timing cues or arbitrary and subjective estimations; we will see an example in the next section. The y-axis represents the information criteria used for a piece. At the intersections between the elements of the axes, we have the functions that set the interaction modalities between information criteria and sound sources. Furthermore, each section can also contain indications on which source to analyse so that the network topology can dynamically change as the piece unfolds. In general, different kinds of if-then-else conditions can be set for each section so that the network's structure becomes adaptive and varies according to the sonic context.
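
Purely as an illustration of this representation, the sketch below encodes such a relational score as a grid: sections along the x-axis carrying durations and source indications, information criteria along the y-axis, and functions (or constants) at the intersections, with an optional condition making the topology adaptive. All field names and values are assumptions, not a fixed format of the piece.

```python
# An illustrative encoding of a relational score as a two-dimensional
# grid: sections (x-axis) with durations and analysis sources, and
# information criteria (y-axis) mapped to functions or constants at
# each intersection. All field names and values are assumptions.
score = [
    {   # section 1: analyse the performer to the left
        "duration_s": 60,
        "source": "LEFT",
        "cells": {"loudness": "converge", "brightness": "diverge",
                  "noisiness": "low", "density": "high"},
    },
    {   # section 2: a conditional rule makes the topology adaptive
        "duration_s": 90,
        "source": "ALL",
        "cells": {"loudness": "diverge", "density": "converge"},
        "rule": lambda ctx: {"source": "ENV"} if ctx["density"] == "high" else {},
    },
]

def section_at(score, t):
    """Return the section active at elapsed time t (seconds)."""
    elapsed = 0
    for section in score:
        elapsed += section["duration_s"]
        if t < elapsed:
            return section
    return score[-1]

print(section_at(score, 75)["source"])  # -> ALL
```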


2.1 Case study: Human Network: Machine Nostalgia (2016-2018)

Human Network: Machine Nostalgia is a score for three or more instrumental performers based on relational mechanisms taking place in the sound domain and distributed among the musicians. The functions and the descriptors are the two main aspects of the score. The functions exhibit two behaviours, diverging and converging, which correspond to the characteristics of positive and negative feedback mechanisms with respect to a state of equilibrium in a system.

The functions are applied to the descriptors, a set of four sonic characteristics describing a sound event from a specified source. The descriptors are loudness, brightness, noisiness, and density. Loudness refers to the intensity of a sound. Brightness indicates the overall spectral energy distribution, i.e., what register is predominant. Noisiness describes how noisy a sound is. Density, instead, refers to the number of individual sound events per unit time. The score is divided into sections of different durations. One of the two functions will be assigned to some of the four descriptors for each section, and different sonic sources will be assigned to each performer. To allow performers to handle more than one variable at a time, the descriptors will be assigned two values only: high and low. Thus, performers will be asked to analyse the incoming sound and estimate one or more descriptors in real time according to their perception and, based on the resulting values, to act according to the function assigned to each descriptor.

Musicians are asked to analyse the descriptors based on one of the other performers, on all of the other performers as a whole (excluding themselves), or on the surrounding environment including the totality of the performers and the background noise. The source to be analysed will change from section to section, and the indication on the score will be as follows: ALL, referring to all the other performers; LEFT/RIGHT, referring to the performer to the left/right side; ENV, referring to the environment.

We have two functions in the score: the diverging function and the converging function. The diverging function represents the behaviour of a positive feedback mechanism, as mentioned earlier. The characteristic of this mechanism is to recursively strengthen the effects of perturbations that push a system away from equilibrium, in turn resulting in exponential deviations. In the music domain, this mechanism translates into producing sound events that move even further in the same direction as the value of the estimated descriptor to which the function is applied. For example, a diverging function applied to a low brightness means producing a sound event whose brightness is as low as the analysed one or lower. The same kind of action applies to the other descriptors and to the opposite value.

The converging function, instead, represents a negative feedback mechanism. It means that this type of action results in a counterbalancing behaviour with a tendency towards oscillations around a dynamical equilibrium point. Musically speaking, the performer applying this function will produce a sound event that goes towards the opposite direction of the estimated value of a descriptor. For example, applying this function to a high loudness means producing a sonic event with a low loudness. If the counterbalancing action is successful, the detected loudness will eventually turn into a low one, which the player will counterbalance with a high one, and so forth. On the score, the following symbols will be used for, respectively, the converging and diverging functions: ><; <>.
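
The two functions admit a compact operational statement. The sketch below is a hedged illustration, not part of the score itself: it maps an estimated descriptor value (high or low) through each function to the value the performer should produce.

```python
# An illustrative formalisation of the score's two functions applied to
# a binary descriptor estimate. Diverging (<>) reinforces the estimated
# value, as in positive feedback; converging (><) counteracts it, as in
# negative feedback.
def apply_function(function, estimate):
    """Return the target value the performer should produce."""
    if function == "<>":                   # diverging
        return estimate
    if function == "><":                   # converging
        return "low" if estimate == "high" else "high"
    raise ValueError(f"unknown function: {function}")

# A converging function applied to a high loudness asks for a low-loudness
# event; a diverging one applied to a low brightness asks for a brightness
# as low as the analysed one or lower.
print(apply_function("><", "high"))  # -> low
print(apply_function("<>", "low"))   # -> low
```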

Considering that the analysis of several descriptors could be highly challenging for a performer, one way to simplify this task is to use constants for some of the descriptors. This way, the analysis task will be limited to a maximum of two descriptors per section, and the remaining descriptors will be handled using the constants high and low, corresponding to the estimation criteria used for the analysis of the descriptors. For example, a density feature with a low constant means that the performer will maintain their interpretation of a low density for the whole section. On the score, constants will be indicated with the words "high" and "low".

Durations are based either on a perceptual clock, given by the individual temporal estimation of each performer, or on a master clock, which is the reference for all performers. For example, if the first section is one minute long, in the first case, each performer will attempt to guess the correct time to switch to the next section. Alternatively, one or more synchronised stopwatches can be used.

The indeterminacy given by the individual interpretations of the musicians makes this piece organic: subjective processing of the context is likely to be unpredictable and to change each time the piece is performed, even with the same performers. Note that, unlike the work by Dahlstedt et al., where the network of relationships is subject to individual interpretations, here we have rigorously formalised interaction modalities where only the analysis of the context is subjective. When the estimation of a sonic feature in a section changes, a chain reaction triggers a series of changes in the other performers and reorganises the piece into new global dynamics. Furthermore, the subjective interpretation of time can have an even more profound effect, for the sections among the performers can overlap differently at each execution, producing a higher-level restructuring of the network.

The score in Figure 1, which is only one of the many possible scores that can be generated with this mutable system of auditory relationships, was performed in the studio with a string trio of excellent musicians: Dimitris Papageorgiou (violin), Armin Sturm (double bass), and Rus Wimbish (double bass). Although several performances of the piece were recorded after many hours of rehearsing, no recording is currently available due to technical issues. Nevertheless, the performances were convincing, and it was fascinating to observe somewhat different dynamical behaviours each time the score was performed.

Figure 1: relational score for Human Network: Machine Nostalgia.

3.0 Cybernetic mapping

Mapping strategies for musical interfaces have been widely studied; see, for example, (Bowers et al. 2016; Mudd et al. 2019). Cybernetic mapping follows the same principles as those described in the previous section for cybernetic improvisation, although, rather than interrelating performing agents, it binds the variables of DSP agents through positive and negative feedback relationships. More precisely, the cybernetic mapping strategy was developed for human-machine interaction performance with semi-autonomous systems. A semi-autonomous system is a machine with some degree of adaptation which, however, cannot alter the network's structure enough to change the relationships among variables or agents.

The setup for the application of cybernetic mapping consists of a machine able to self-modulate its output and, to some extent, shape its formal developments, coupled with a human entity through a feedback loop provided by some improvisational modality. The role of the human entity, in particular, is to alter the adaptation modality and adaptation ranges in the system while the relationship between pairs of internal variables is maintained.

Information processing for adaptation can follow criteria based on perceptual models, or it can be based on abstract principles that are meaningful for the machine, depending on the specific goals set. See (Sanfilippo 2021b). These goals can rely on a specific characteristic of the sonic output, e.g., the amplitude, frequency, spectral distribution of energy, or noisiness. For example, if the goal is to keep the overall noisiness constant in a system with two agents, a bandpass filter and a frequency modulator (FM), the modulation index in the FM and the Q in the bandpass filter should be directly proportional. Similarly, if we have a saturator and an FM unit and we still want to keep a roughly constant noisiness, the amplitude of the saturator and the modulation index of the FM should be inversely proportional. Of course, this kind of relationship among variables does not guarantee that the system will always respond as expected, as these systems are highly nonlinear and subject to significant changes even with small perturbations on seemingly non-affecting parameters. Nonetheless, these criteria allow for a fundamental framework for building networks of relationships deriving from precise criteria.
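
As a sketch of this idea, with parameter names and ranges chosen purely for illustration: a single normalised control drives the FM modulation index and the band-pass Q directly proportionally, so that the noisiness contributed by those two agents remains roughly constant, while a saturator drive is mapped inversely proportionally to the FM index.

```python
# A sketch of cybernetic mapping with normalised controls in [0, 1] and
# illustrative parameter ranges. Directly proportional pairs hold the
# shared characteristic (noisiness) roughly constant; the inversely
# proportional pair counterbalances it.
def scale(x, lo, hi):
    return lo + x * (hi - lo)

def cybernetic_map(control):
    return {
        # directly proportional pair: FM index and band-pass Q
        "fm_index": scale(control, 0.0, 10.0),
        "bandpass_q": scale(control, 0.5, 20.0),
        # inversely proportional: saturator drive against the FM index
        "saturator_drive": scale(1.0 - control, 0.1, 4.0),
    }

print(cybernetic_map(0.25))
```

A single gesture on the control thus reshapes the whole network while the pairwise relationships, and with them the targeted sonic characteristic, are preserved.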

Alternatively, instead of choosing relationships according to some desired characteristic of the overall output, it is possible to build the network of relationships considering a specific characteristic between pairs of variables, hence working locally and letting the global behaviour emerge. In this case, if we had a high-pass resonant filter with input gain, resonance, and cut-off parameters, then, regarding the characteristic of amplitude, input gain and cut-off counterbalance each other when directly proportional, while they contribute to an imbalance when inversely proportional: the first configuration tends towards equilibrium, while the second one tends to be far-from-equilibrium. The same kind of relationship occurs between resonance and cut-off, while input gain and resonance tend towards an imbalance when directly proportional, and towards balance when inversely proportional.

It is possible to have an arbitrary number of positive and negative feedback relationships in this type of network. A simple procedure to obtain approximately equal numbers of positive and negative feedback relationships is to switch the relationship at every other pair of variables. For example, if we have a system with parameters A, B, C, D and positive feedback between A <-> B, then we will set negative feedback between B <-> C and again positive feedback between C <-> D, and so on. Lastly, the same set of variables can be mapped using the same principles over several chains of relationships, each representing different relationships based on different characteristics such as amplitude, spectral tendency, or noisiness. The human agent can then select the desired environment in which to operate and affect the system's state variables and regions of adaptation. Furthermore, the environments can be interpolated to obtain a multi-dimensional space where it is possible to transition over different control chains.
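
The alternation procedure can be stated directly in code. The sketch below, assuming the parameters are given as an ordered list, assigns positive feedback to the first adjacent pair, negative to the next, and so on, yielding approximately equal numbers of each.

```python
# The alternation procedure: adjacent parameter pairs receive positive
# and negative feedback relationships in turn, giving approximately
# equal numbers of each.
def alternate_relationships(parameters):
    pairs = zip(parameters, parameters[1:])
    return [(a, b, "positive" if i % 2 == 0 else "negative")
            for i, (a, b) in enumerate(pairs)]

for a, b, sign in alternate_relationships(["A", "B", "C", "D"]):
    print(f"{a} <-> {b}: {sign} feedback")
# A <-> B: positive feedback
# B <-> C: negative feedback
# C <-> D: positive feedback
```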

4.0 Reduced intervention

The reduced intervention performance modality is the practical realisation of the concept of losing control to gain complexity (Sanfilippo 2020). This approach to human-machine interaction with autonomous systems deliberately seeks minimum interference so that the machine can fully express itself. Unlike the cybernetic improvisation approach, in which the relational network is established especially at the low level, the reduced intervention modality aims at building relationships between the human and the machine that are functions of high-level information and of analysis frames extending over long periods. The human-machine interfacing, the trends displayed by the dynamical behaviours of the system, and the physical conditions and characteristics of the environment where the work is performed are all determinant factors that inform the strategy for the high-level interaction chain. The case study below follows these performance principles and shows how physical, technical, and aesthetic aspects of the work affect the control (or lack thereof) strategy. The analysis that performers deploy concerns the complexity in the evolutions of the system; the coherence and completeness of formal developments; the tendencies towards the emergence of sound or silence; the variations in the size of the edge-of-chaos regions (Sanfilippo 2021b); and more.

Despite the high-level nature of this approach, the reduced intervention performance can still be formalised following the cybernetic principles of buffering, positive feedback, and negative feedback or, more generally, through articulated or straightforward chains of if-then-else conditions applied to the analysis criteria mentioned above. Similarly, constraints can also drive the control process and the outcome towards specific targets or paths.
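
A hedged sketch of such a chain of conditions, with feature names, window semantics, and thresholds invented for illustration: intervention occurs only when long-window analysis signals a collapse in complexity, a prolonged tendency towards silence, or an exhausted dynamical behaviour.

```python
# An illustrative if-then-else chain for reduced intervention, driven by
# long-window, high-level measurements. Feature names, thresholds, and
# window semantics are assumptions made for the sketch.
def intervention_policy(analysis):
    """Map high-level analysis over a long window to a performer action."""
    if analysis["complexity_trend"] < -0.5:
        # drastic drop in output complexity: perturb the state variables
        return "perturb_state_variables"
    if analysis["silence_ratio"] > 0.8:
        # prolonged tendency towards silence: raise the feedback energy
        return "increase_feedback_gain"
    if analysis["minutes_in_attractor"] > 10:
        # behaviour explored long enough: relocate the adaptation ranges
        return "shift_adaptation_ranges"
    return "do_nothing"                    # the default, preferred action

print(intervention_policy({"complexity_trend": -0.1,
                           "silence_ratio": 0.2,
                           "minutes_in_attractor": 12}))  # -> shift_adaptation_ranges
```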


4.1 Case study: Single-Fader Versatility (2016)

Single-Fader Versatility is a human-machine interaction performance implementing the ideas of cybernetic mapping and reduced intervention discussed in the previous sections. The machine is a feedback network containing eight semi-autonomous agents that implement audio processing techniques such as state-variable filtering, recursive nonlinear distortion, audio-driven granulation, sampling, pulse-width modulation, and frequency shifting. The agents are semi-autonomous and can thus self-modulate their internal variables within the ranges set by the cybernetic mapping. The human performer can rewire the network by switching among fully-connected, quasi-fully-connected, circular, and diagonal (identity) topologies. The performer can also transition between open and closed configurations by modulating the amount of external signals from the environment flowing into the network. Furthermore, the human performer can vary the output amplitude of each agent, which is a systemic action as it affects the strength of the recirculating signals in the feedback loops and, consequently, the system as a whole. Most importantly, the performer can operate a single fader to affect the variables of the agents and the ranges of adaptation and self-modulation.

For autonomous feedback systems, especially for closed systems with no interaction with the environment, a requirement for the realisation of the machine and its evolutions is self-oscillation. Arguably, autonomous music systems should be able to modulate the overall amplitude, even attenuating it for extended periods. However, sufficient energy must recirculate within the network to ensure that sounds and evolutions can emerge continuously for an indefinite duration. For this self-oscillating system, the cybernetic mapping among variables concerns the amplitude relationships. It aims to have an approximately equal number of positive and negative feedback relationships to maintain self-oscillation while having feedback coefficients slightly above the stability threshold. The system is then kept stable using adaptive compression units whose analysis modality can switch between RMS and peak measurements.
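
A minimal sketch of such a stabilising stage, assuming a one-pole follower and an illustrative threshold: the measurement modality switches between RMS and peak, and the resulting gain reduction keeps the recirculating energy bounded even with loop gains slightly above the stability threshold.

```python
# A sketch of an adaptive compression stage for a self-oscillating loop.
# The follower switches between RMS and peak measurement; the threshold
# and smoothing coefficient are illustrative assumptions.
import math

def compress(samples, threshold=0.9, mode="rms", coeff=0.001):
    env = 0.0
    out = []
    for x in samples:
        if mode == "rms":
            env += coeff * (x * x - env)               # smoothed mean square
            level = math.sqrt(max(env, 0.0))
        else:                                          # peak: instant attack, slow release
            level = max(abs(x), env * (1.0 - coeff))
            env = level
        gain = threshold / level if level > threshold else 1.0
        out.append(x * gain)                           # bound the recirculating energy
    return out

# A loop gain of 1.05, slightly above the stability threshold, would
# diverge without the compressor; with it, the level settles at the threshold.
x = 0.1
for _ in range(200):
    x = compress([x * 1.05], mode="peak")[0]
print(round(x, 3))  # -> 0.9
```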

The performer attempts to influence the system as little as possible, and significant variations in the state variables take place when a dynamical behaviour has been explored sufficiently or when a drastic drop in the overall complexity of the output occurs. This project was last performed in Vienna, Austria, on the 29th of June 2019. Despite the low quality of the audio, the following video may still be helpful to understand the work (Link accessed on the 29th of August 2019):

5.0 Electroacoustic devices and the environment as interfaces

In recursive networks of interdependent components that interact in a nonlinear way, every single element has a potentially critical role that affects the global output of the system. The components of these networks establish a synergetic relationship in a highly distributed self-organising control structure from which emergent behaviours originate. Their intrinsic non-reductionist nature does not allow for analytical procedures where the components are observed individually. In doing so, the relationships among components would be bypassed, and the very essence of the system would be lost. "Linearity is a reductionist's dream, and nonlinearity can sometimes be a reductionist's nightmare," said Melanie Mitchell (Mitchell 2009). It is then not possible to quantify the contribution of single elements to global behaviours as the system can only operate and develop as a whole.

The nonlinearity property, defined through the superposition principle, extends to different perspectives and observation scales. For example, if the state variables of a complex system at the initial condition are altered and subsequently reset to the original configuration, the system's output differs from the initial one. Thus, complex systems show a clear asymmetry: their past shapes their present, which shapes their future, demonstrating the inexorability of time and the irreversibility of the process (Prigogine 1978). This nonlinearity extends to the interrelatedness of the variables in a system, which responds organically to modifications. In linear and non-recursive systems in the sound and music domain, amplitude-related variables are expected to affect amplitude-related characteristics of sound, and frequency-related variables are expected to affect frequency-related ones; examples are the threshold of a dynamic compressor or the shift amount in frequency modulation. When these units become part of a nonlinear feedback network, variations in a single variable are likely to produce variations in all or most aspects characterising the system's output, even without adaptation.
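
The superposition definition invoked here can be made concrete with a two-line check: a system f is linear only if f(x1 + x2) = f(x1) + f(x2), and a saturating nonlinearity of the kind found in these networks fails the test. A minimal sketch:

```python
# Nonlinearity via the superposition principle: a linear gain satisfies
# f(x1 + x2) == f(x1) + f(x2); a saturator (tanh) does not.
import math

linear = lambda x: 0.5 * x
saturate = math.tanh

x1, x2 = 0.6, 0.7
print(math.isclose(linear(x1 + x2), linear(x1) + linear(x2)))        # True
print(math.isclose(saturate(x1 + x2), saturate(x1) + saturate(x2)))  # False
```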

Similarly to how performing human agents become part of a more extensive network when inside the feedback loops, the electroacoustic devices and the environment where the performance takes place also become extensions of a meta-system within which they acquire a systemic role. Microphones, loudspeakers, and the environment shape the system and, since "sound is the interface" (Di Scipio 2003), they serve as the core connection between the human and the machine or between the machine and itself. David Tudor and Alvin Lucier pioneered the use of microphones, loudspeakers, sound, and the environment as interfaces as early as the 1970s: Tudor realised his performance Microphone in 1973; Lucier realised Bird and Person Dyning in 1975. Analyses of these works can be found in (Sanfilippo 2012a; Sanfilippo and Valle 2013).


5.1 Case study: Audible Icarus (2016-2018)

Audible Icarus is a human-machine interaction performance project initially conceived in 2012 (Sanfilippo 2012b) but developed and fully implemented in 2016. The performance utilises an autonomous ecosystem that is interfaced with a human agent through microphones, loudspeakers, and the environment. This project realises the concepts addressed in the preceding section about the reduced intervention performance modality. The recording of a live performance is available at the following link: https://soundcloud.com/dario-sanfilippo/audible-icarus.

The DSP network consists of four adaptive agents based on the following processing techniques: granulation, sampling, reverberation, and pulse-width modulation. The information processing infrastructure has fixed modules for the extraction of information while the adaptation ranges and most of the variables in the agents are time-variant and dependent on measurements of low-level features from local signals in each agent. The agents do not implement digital feedback; hence, they do not self-oscillate unless they are coupled through the microphones, loudspeakers, and the environment. The loudspeakers, which can vary in number, are typically placed near the corners and pointed towards the walls to maximise and enhance the resonant characteristics of the environment. In addition, there are usually one or two microphones sending signals to the agents with different routing possibilities.

A fundamental aspect of the ecosystemic approach is calibration. Investigating different environments to achieve different behaviours is, of course, part of the creative practice, although these systems must operate at optimal conditions to express their maximum potential. Non-equilibrium can be a source of order and self-organisation from disorder (Prigogine 1978); complex systems, too, can benefit from far-from-equilibrium conditions. In self-oscillating systems, such a condition can be favoured by setting a minimally self-oscillating state, that is, a configuration of the feedback coefficients that allows enough energy to recirculate without forcing the system towards any attractors. A state of minimal self-oscillation is optimal for the emergence of fluctuations and for maximal sensitivity to perturbations. A high sensitivity, in turn, is particularly desirable in the ecosystemic approach as the environment and the observers become a source of perturbation themselves and, consequently, the origin of new dynamics.

One calibration procedure for Audible Icarus is to locate as many spots in the space as there are agents in the DSP network. Each spot corresponds to one agent, which is calibrated individually, with the other agents' outputs bypassed, after placing the microphones in that position. The calibration aims to find input gains for the single agents such that minimal self-oscillating activity takes place. The performer interacts with the system by moving in the space, holding the microphones, and exploring resonances and anti-resonances in the environment to drive the system towards different dynamics. For a realisation of the piece, all loudspeakers are placed on one side of the space. The performer locates an area far away from the loudspeakers that will be a non-self-oscillation zone. By moving closer to the loudspeakers along a straight line, the performer identifies successive spots for the calibration of the agents. Ideally, the spots should be equally distant, and the last spot should be relatively close to the loudspeakers. The result is what, in a linear setup, would appear as a path of progressively activating agents; due to the nonlinear response of the agents, however, the exploration of their activation and mutual interference is highly nontrivial.
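
A hedged pseudo-procedure for this calibration, with the measurement step left abstract since it depends on the room, the equipment, and ultimately the performer's judgement: raise each agent's input gain in small steps and keep the lowest gain at which self-oscillation just emerges.

```python
# A pseudo-procedure for the calibration: raise each agent's input gain
# in small steps until self-oscillation just emerges. The measurement is
# left abstract (here a hypothetical callable); in the piece it is the
# performer, with microphones at the spot and the other agents bypassed.
def calibrate(agent, measure, step=0.01, max_steps=200):
    for i in range(1, max_steps + 1):
        gain = i * step
        if measure(agent, gain):
            return gain        # lowest gain at which oscillation emerges
    raise RuntimeError("no self-oscillation within the gain range")

# Demo with a simulated environment whose oscillation threshold is 0.375;
# the procedure is repeated for each spot/agent along the path.
simulated = lambda agent, gain: gain >= 0.375
print(round(calibrate("granulator", simulated), 2))  # -> 0.38
```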

In this setup, the performer starts from the non-self-oscillation area and walks straight towards the loudspeakers until reaching a position relatively close to them. The performer then walks back to the initial position, where the performance ends. The piece, hence the process of walking towards the loudspeakers and back, takes between 15 and 30 minutes, homogeneously distributed along the path. Concerning the emergence of sound, the performer follows a negative feedback response while walking towards the loudspeakers, and a positive feedback one while walking back. Practically, while the performer moves forward, an action that intrinsically favours the emergence of sound as it technically increases the feedback coefficient, the performer contrasts that emergence by pointing the microphones in different directions. On the other hand, when moving backwards, towards silence, the performer supports the emergence of sound by finding and holding resonant areas with the microphones.

6.0 Summary

We have discussed the relationships between humans and machines and the interfaces that bridge these entities in live performance. In this framework, live performance with autonomous systems is considered a higher-level whole that emerges from relentless and retroactive adaptations between the human and the machine: between their actions, reactions, and organisations. This framework requires an environment where low-level and high-level interactions can be formalised to connect the human and the machine. The idea of cybernetic improvisation is the realisation of such an environment. The principles of cybernetics for self-regulation and control in systems of interacting components constitute the processes that the human performer applies to their sonic context. On the other hand, the term improvisation emphasises that the performer's output is a continuous function of their input being processed through the cybernetic criteria.

This approach allows for the realisation of networks of interactions between humans and machines, or between several humans, based on simple principles that unfold into nontrivial behaviours due to circular nonlinearity and adaptation. The human performer becomes an audio and music analysis algorithm that responds to the sonic context according to rules and methods. The information extracted from the machine's output and environmental perturbations can be low-level or high-level. In the first case, the improvisation modalities are applied to characteristics such as loudness, brightness, and noisiness; for high-level characteristics, instead, we have measurements such as event density, dynamicity, and complexity. The relational criteria are represented by the positive and negative feedback mechanisms and by other control operations such as buffering and delaying.

The cybernetic mapping approach is based on the same principles described above. It is used to interrelate the variables in a DSP network systemically, allowing the performer to interact with digital audio networks in an organic and agile way. This technique identifies the fundamental sonic characteristics connected to DSP variables to create positive and negative feedback links. Typically using one-to-many mapping strategies with different mapping functions, the performer can operate on a single parameter to reshape the characteristics of the DSP network while maintaining the relationships among DSP agents. Alternatively, this approach can allow the performer to create relational biases by shifting the feedback relationship of pairs of DSP variables from positive to negative and vice versa.

The article concludes with a section on reduced intervention performance modalities, which are a practical application and consequence of a novel formulation of musicianship, and with an analysis of how the performance space and the electroacoustic devices can be deployed as interfaces to explore the aesthetics of the machine.

7.0 References

Bowers, J., et al. (2016). One Knob to Rule Them All: Reductionist Interfaces for Expansionist Research. In Proceedings of the International Conference on New Interfaces for Musical Expression, Queensland Conservatorium Griffith University, pages 433–438.

Dahlstedt, P., Nilsson, P. A., and Robair, G. (2015). The bucket system: a computer-mediated signalling system for group improvisation. In Proceedings of the International Conference on New Interfaces for Musical Expression (NIME), pages 317–318.

Di Scipio, A. (2003). 'Sound is the interface': from interactive to ecosystemic signal processing. Organised Sound, 8(3):269–277.

Gleick, J. (2011). Chaos: Making a new science. Open Road Media.

Heylighen, F. and Joslyn, C. (2001). Cybernetics and second-order cybernetics. Encyclopedia of Physical Science & Technology, 4:155–170.

Mitchell, M. (2009). Complexity: A guided tour. Oxford University Press.

Mudd, T., Holland, S., and Mulholland, P. (2019). Nonlinear dynamical processes in musical interactions: Investigating the role of nonlinear dynamics in supporting surprise and exploration in interactions with digital musical instruments. International Journal of Human-Computer Studies, 128:27–40.

Murray-Rust, D. and Smaill, A. (2011). Towards a model of musical interaction and communication. Artificial Intelligence, 175(9-10):1697–1721.

Pressing, J. (1984). Cognitive processes in improvisation. In Advances in Psychology, volume 19, pages 345–363. Elsevier.

Pressing, J. (1988). Improvisation: methods and models. In Sloboda, J. A. (ed.), Generative Processes in Music: The Psychology of Performance, Improvisation, and Composition, pages 129–178. Clarendon Press.

Prigogine, I. (1978). Time, structure, and fluctuations. Science, 201(4358):777–785.

Sanfilippo, D. (2012a). Osservare la macchina performante e intonare l’ambiente: Microphone di David Tudor. Le Arti del Suono, 1(6):70–82.

Sanfilippo, D. (2012b). Lies (distance/incidence) 1.0: a human-machine interaction performance. In Proceedings of the 19th Colloquium on Music Informatics, pages 198–199.

Sanfilippo, D. (2020). Complex musical behaviours via time-variant audio feedback networks and distributed adaptation: a study of autopoietic infrastructures for real-time performance systems. PhD dissertation, The University of Edinburgh.

Sanfilippo, D. (2021a). Time-domain adaptive algorithms for low-level and high-level audio information processing. Computer Music Journal, 45(1):(forthcoming).

Sanfilippo, D. (2021b). Complex Adaptation in Audio Feedback Networks for the Synthesis of Music and Sounds. Computer Music Journal, 45(1):(forthcoming).

Sanfilippo, D. and Di Scipio, A. (2017). Environment-mediated coupling of autonomous sound-generating systems in live performance: An overview of the Machine Milieu project. In Proceedings of the 14th Sound and Music Computing Conference, Espoo, Finland, pages 5–8.

Sanfilippo, D. and Valle, A. (2013). Feedback systems: An analytical framework. Computer Music Journal, 37(2):12–27.

Imprint

Issue: #3
Date: 31 January 2022
Review status: Double-blind peer review
Cite as: Sanfilippo, Dario. 2022. "Towards the formalisation of structurally-coupling performance modalities and interfaces in human-machine interaction." ECHO, a journal of music, thought and technology 3. doi: 10.47041/WRST7328
