Journal & conference papers

2018

  • C. d’Alessandro, L. Feugère, O. Perrotin, S. Delalez, B. Doval
    Le contrôle des instruments chanteurs (The control of singing instruments)
    14ème Congrès Français d’Acoustique (CFA ’18), Le Havre, April 23-27, 2018

2017

  • E. Amy de la Bretèque, B. Doval, L. Feugère, L. Moreau-Gaudry
    Liminal Utterances and Shapes of Sadness: Local and Acoustic Perspectives on Vocal Production among the Yezidis of Armenia
    Yearbook for Traditional Music, vol. 49, 2017, pp. 128–148
  • V. Goudard, H. Genevois, L. Feugère
    Stratégies de contrôle de la hauteur dans les instruments de musique numériques (Pitch control strategies in digital musical instruments)
    Revue Francophone d’Informatique et Musique [online], n° 5 – Informatique et musique : Recherche et Création 1, updated 28/07/2017. URL: http://revues.mshparisnord.org/rfim/index.php?id=425

    Abstract: In many cultures, music is built on pitch scales, which play a predominant role in instrument making. But the freedom of mapping that computers allow, together with the diversity of interfaces, gives rise to a wide variety of pitch control strategies in digital instruments. Without claiming to be exhaustive, this article attempts to give an overview by proposing: 1) a review of interfaces for producing discrete and/or continuous pitches; 2) a review of digital instrument-design strategies offering the performer easier and more precise control of pitch; 3) algorithms developed by the authors for continuous pitch control; 4) some comparisons with acoustic instruments. Finally, a publicly available Max patch supports this article and lets the reader test the mapping strategies it presents.

  • L. Feugère, C. d’Alessandro, B. Doval, O. Perrotin
    Cantor Digitalis: chironomic parametric synthesis of singing (PDF)
    EURASIP Journal on Audio, Speech, and Music Processing, 2017:2

    Abstract: Cantor Digitalis is a performative singing synthesizer that is composed of two main parts: a chironomic control interface and a parametric voice synthesizer. The control interface is based on a pen/touch graphic tablet equipped with a template representing vocalic and melodic spaces. Hand and pen positions, pen pressure, and a graphical user interface are assigned to specific vocal controls. This interface allows for accurate real-time control over high-level singing synthesis parameters. The sound generation system is based on a parametric synthesizer featuring a spectral voice source model, a vocal tract model consisting of parallel filters for the vocalic formants cascaded with an anti-resonance for the spectral effect of the hypopharyngeal cavities, and rules for parameter settings and source/filter dependencies between fundamental frequency, vocal effort and formants. Because Cantor Digitalis is a parametric system, every aspect of voice quality can be controlled (e.g., vocal tract size, aperiodicities in the voice source, vowels, and so forth). It offers several presets for different voice types. Cantor Digitalis has been played on stage in several public concerts, and it has also proven useful as a tool for voice pedagogy. The aim of this article is to provide a comprehensive technical overview of Cantor Digitalis.
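
    As an aside, the source-filter design described above can be illustrated compactly. The sketch below is a minimal Python illustration, not the Cantor Digitalis code: an impulse-train source is summed through parallel second-order formant resonators, with rough /a/ formant values assumed for the example.

        # Minimal parallel formant synthesis sketch (illustrative only).
        import math

        SR = 16000   # sample rate in Hz (arbitrary choice)
        F0 = 220.0   # fundamental frequency in Hz
        # (frequency Hz, bandwidth Hz, branch gain): rough values for /a/
        FORMANTS = [(700, 80, 1.0), (1220, 90, 0.5), (2600, 120, 0.25)]

        def resonator(freq, bw):
            """Two-pole resonator coefficients for one formant."""
            r = math.exp(-math.pi * bw / SR)
            a1 = -2.0 * r * math.cos(2.0 * math.pi * freq / SR)
            return a1, r * r, 1.0 - r   # a1, a2, crude gain normalization

        def synthesize(duration=0.5):
            n = int(duration * SR)
            period = int(SR / F0)
            # Impulse train standing in for the glottal source model
            source = [1.0 if i % period == 0 else 0.0 for i in range(n)]
            out = [0.0] * n
            for freq, bw, amp in FORMANTS:   # parallel formant branches
                a1, a2, g = resonator(freq, bw)
                y1 = y2 = 0.0
                for i in range(n):
                    y = g * source[i] - a1 * y1 - a2 * y2
                    out[i] += amp * y
                    y2, y1 = y1, y
            return out

        samples = synthesize()   # 0.5 s of a synthetic /a/-like vowel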

2016

  • L. Feugère, C. d’Alessandro, S. Delalez, L. Ardaillon, A. Roebel
    Evaluation of singing synthesis: methodology and case study with concatenative and performative systems (Slides)
    Proc. Interspeech 2016, 1245-1249.

    Abstract: The special session Singing Synthesis Challenge: Fill-In the Gap aims at comparative evaluation of singing synthesis systems. The task is to synthesize a new couplet for two popular songs. This paper addresses the methodology needed for quality assessment of singing synthesis systems and reports on a case study using two systems with a total of six different configurations. The two synthesis systems are: a concatenative Text-to-Chant (TTC) system, including a parametric representation of the melodic curve; and a Singing Instrument (SI), allowing for real-time interpretation of utterances made of flat-pitch natural voice or diphone-concatenated voice. Absolute Category Rating (ACR) and Paired Comparison (PC) tests are used. Natural and degraded-natural reference conditions are used for calibration of the ACR test. The MOS obtained using ACR shows that the TTC (resp. the SI) ranks below natural voice but above (resp. in between) the degraded conditions. Thus, singing synthesis quality is judged better than auto-tuned or distorted natural voice in some cases. PC results show that: 1/ signal processing is an important quality issue, making the difference between systems; 2/ diphone concatenation degrades quality compared to flat-pitch natural voice; 3/ automatic melodic modelling is preferred to gestural control for off-line synthesis.
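
    For readers unfamiliar with the two test types, the toy sketch below (Python, with made-up numbers, not data from the paper) shows how ACR ratings on a 1-5 scale reduce to a Mean Opinion Score per condition, and how paired-comparison counts reduce to a preference rate.

        # Toy ACR and PC reductions (hypothetical ratings, not the paper's data).
        acr = {
            "natural":    [5, 5, 4, 5],
            "TTC":        [4, 3, 4, 4],
            "SI":         [3, 3, 2, 4],
            "auto-tuned": [3, 2, 3, 2],
        }
        mos = {cond: sum(r) / len(r) for cond, r in acr.items()}  # MOS per condition

        wins_ttc, wins_si = 14, 6                    # made-up pair counts: TTC vs. SI
        pref_ttc = wins_ttc / (wins_ttc + wins_si)   # preference rate for TTC over SI

        print(sorted(mos.items(), key=lambda kv: -kv[1]))
        print(f"TTC preferred over SI in {pref_ttc:.0%} of pairs")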

2015

  • L. Feugère, C. d’Alessandro
    Contrôle gestuel de la synthèse vocale. Les instruments Cantor Digitalis et Digitartic (Gestural control of voice synthesis: the Cantor Digitalis and Digitartic instruments) (PDF)
    Traitement du signal, 32(4), 417-442, 2015. DOI:10.3166/TS.32.417-442

    Extended Abstract: Gestural control of speech or singing synthesis is difficult because of the very fast articulatory motions involved in speech and singing. For singing, another difficult question is rhythmic coordination and precision, because syllables must coincide with musical beats at a given tempo. The precise location of beats within different syllables must therefore be controlled.
    Two singing synthesis instruments are presented: Cantor Digitalis and Digitartic. Both instruments use bimanual writing or drawing gestures on graphic tablets. The voice signal is computed with a parametric synthesizer, including a voice source model, consonantal noise models and series/parallel formant filters. Cantor Digitalis is a vowel and semi-vowel singing instrument. Digitartic is an extension of Cantor Digitalis that allows for singing syllables, including plosive, fricative, liquid and nasal consonants. Any place of articulation in between the canonical ones is available by linear interpolation of the consonant parameters, for each mode of articulation.
    In this paper, the focus is on Digitartic, through the issue of consonant gestures and musical beat synchronization. Three modes of Vowel-Consonant-Vowel (VCV) articulation control are discussed, according to three levels of rhythmic precision and musical context. A VCV articulation is composed of the onset phase (articulators approaching the position of maximum constriction), the medial phase (maximum constriction) and the offset phase (constriction release). The offset phase of plosives is very short compared to that of other consonants.
    The first control mode consists of triggering the syllable at the beginning of the onset phase. However, when the syllable starts on the musical beat, it is perceived with a delay that depends on the duration of the articulation phases. Anticipating this delay precisely is very difficult. The second control mode allows for triggering the VCV dissyllable in two steps: in the first step the onset phase is triggered, and in the second step the offset phase is triggered. In this way, plosives can be synchronized with musical beats without any delay. The third control mode is a continuous control of the phases of articulation, without any triggering. This requires a fast synthesis engine, a high interface sampling rate, and an expert control gesture, fast and precise enough to reproduce the phases of speech articulation.
    The continuous control mode of articulation is performed by a back-and-forth gesture with the pen of the non-preferred hand, along the vertical dimension of the graphic tablet. The place of articulation is continuously controlled along the horizontal dimension, and the mode of articulation is assigned to different areas of the tablet. This back-and-forth gesture is analogous to the somewhat symmetric articulation of the VCV dissyllable. The gesture amplitude allows for different degrees of articulation (hypoarticulation to hyperarticulation). Controlling the duration of each phase of articulation is another means of increasing expressiveness. The preferred hand controls pitch, vocal effort and vowel quality on another graphic tablet, so pitch and vocal effort can be modified during each phase of articulation. Cantor Digitalis and Digitartic allow for expressive musical performances and are regularly used in concerts.
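
    The second control mode lends itself to a compact illustration. The sketch below is a schematic Python state machine, not the actual Max/MSP implementation, and its phase grouping is simplified: a first pen event plays the onset and holds the constriction, a second one triggers the release, so a plosive can land exactly on the beat.

        # Schematic two-step VCV triggering (illustrative, simplified phases).
        class TwoStepVCV:
            def __init__(self):
                self.state = "vowel1"            # steady first vowel

            def pen_event(self):
                if self.state == "vowel1":
                    self.state = "constriction"  # onset played, medial phase held
                elif self.state == "constriction":
                    self.state = "vowel2"        # offset (release) played on the beat
                return self.state

        syllable = TwoStepVCV()
        print(syllable.pen_event())  # "constriction": triggered ahead of the beat
        print(syllable.pen_event())  # "vowel2": release synchronized with the beat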

  • L. Feugère, B. Doval, M.-F. Mifune
    Using pitch features for the characterization of intermediate vocal productions (Proceedings)
    5th International Workshop on Folk Music Analysis (FMA), Paris, June 10-12, 41-48, 2015. ISBN 979-10-95209-00-3

    Abstract: This paper presents pitch features for the characterization of intermediate vocal productions from the CNRS – Musée de l’Homme sound archives, in the context of the DIADEMS interdisciplinary project gathering researchers from ethnomusicology and speech signal processing. Different categories – chanting, singing, recitation, storytelling, talking, lament – have been identified and characterized by ethnomusicologists and are confronted with acoustic analysis. A database totalling 79 utterances from 25 countries around the world is used. Among the tested features, the note duration distribution proved to be a relevant measure. Categories are mostly characterized by the proportion of 100-ms notes and the duration of the longest note. These features were evaluated through a supervised classification using the different vocal categories. Classification results show that these two features allow a good discrimination between “speech”, “chanting” and “singing”, but are not suited for discriminating between the “speech” subcategories “recitation”, “storytelling” and “talking”.
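
    The two retained features are simple enough to state in a few lines. The sketch below computes them in Python on a made-up list of note durations; the 100-ms threshold follows the abstract, but the decision rule and its cutoffs are only illustrative, not the paper's trained classifier.

        # Duration features on a hypothetical utterance (durations in seconds).
        durations = [0.08, 0.11, 0.09, 0.42, 0.10, 1.35, 0.12]

        short_ratio = sum(1 for d in durations if d <= 0.1) / len(durations)
        longest = max(durations)                 # duration of the longest note

        # Illustrative rule of thumb: speech tends to give many ~100-ms notes
        # and no long sustained note; singing gives the opposite pattern.
        label = "speech-like" if short_ratio > 0.5 and longest < 1.0 else "song-like"
        print(short_ratio, longest, label)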

2014

  • C. d’Alessandro, L. Feugère, S. Le Beux, O. Perrotin, A. Rilliard
    Drawing melodies: Evaluation of Chironomic Singing Synthesis
    J. Acoust. Soc. Am., 135(6), 3601-3612, 2014.
    (PDF Copyright (2014) Acoustical Society of America. This article may be downloaded for personal use only. Any other use requires prior permission of the author and the Acoustical Society of America.)

    Abstract: Cantor Digitalis, a real-time formant synthesizer controlled by a graphic tablet and a stylus, is used for the assessment of melodic precision and accuracy in singing synthesis. Melodic accuracy and precision are measured in three experiments for groups of 20 and 28 subjects. The task of the subjects is to sing musical intervals and short melodies, at various tempi, using chironomy (hand-controlled singing), mute chironomy (without audio feedback), and their own voices. The results show the high accuracy and precision obtained by all the subjects for chironomic control of singing synthesis. Some subjects performed significantly better in chironomic singing than in natural singing, although other subjects showed comparable proficiency. For the chironomic condition, mean note accuracy is less than 12 cents and mean interval accuracy is less than 25 cents for all the subjects. Comparing chironomy and mute chironomy shows that the skills used for writing and drawing transfer to chironomic singing, but that audio feedback helps interval accuracy. Analysis of blind chironomy (without visual reference) indicates that visual feedback helps greatly in both note and interval accuracy and precision. This study demonstrates the capabilities of chironomy as a precise and accurate means of controlling singing synthesis.
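
    For reference, the accuracy figures above are expressed on the standard cent scale, 1200 * log2(f / f_target); a small Python sketch with hypothetical values:

        # Pitch deviation in cents (100 cents = 1 equal-tempered semitone).
        import math

        def cents(f, f_target):
            """Signed deviation of frequency f from f_target, in cents."""
            return 1200.0 * math.log2(f / f_target)

        # Hypothetical example: target A4 = 440 Hz, produced 443 Hz.
        print(f"{cents(443.0, 440.0):+.1f} cents")   # about +11.8 cents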

  • V. Goudard, H. Genevois, L. Feugère
    On the playing of monodic pitch in digital music instruments
    40th International Computer Music Conference (ICMC) joint with the 11th Sound & Music Computing Conference (SMC), Athens, September 14-20, 2014, 1418-1425.

    Abstract: This paper addresses the issue of controlling monodic pitch in digital musical instruments (DMIs), with a focus on instruments for which pitch needs to be played with accuracy. Indeed, in many cultures, music is based on discrete sets of ordered notes called scales, so the need to control pitch has a predominant role in acoustical instruments as well as in most DMIs. But the freedom of parameter mapping allowed by computers, as well as the wide range of interfaces, opens up a large variety of strategies for controlling pitch in DMIs. Without pretending to be exhaustive, our paper aims to draw up a general overview of this subject. It includes: 1) a review of interfaces to produce discrete and/or continuous pitch; 2) a review of DMI makers' strategies to help the performer control pitch easily and accurately; 3) some developments from the authors concerning interfaces and mapping strategies for continuous pitch control; 4) some comparisons with acoustical instruments. Finally, a Max/MSP patch (publicly available) is provided to support the discussion by allowing the reader to test some of the pitch control strategies reviewed in this paper.
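
    One way to help the performer play accurately, in the spirit of the strategies reviewed here, can be sketched compactly. The example below is in Python rather than Max/MSP, and both the linear position-to-pitch mapping and the attraction coefficient are assumptions for illustration: a continuous pitch is pulled toward the nearest scale degree while glissandi remain possible.

        # Continuous pitch with adjustable attraction toward a scale (illustrative).
        def position_to_midi(x, lo=48.0, hi=72.0):
            """Map a normalized tablet position x in [0, 1] to a MIDI pitch."""
            return lo + x * (hi - lo)

        def attract(pitch, strength=0.7, scale=(0, 2, 4, 5, 7, 9, 11)):
            """Pull a continuous MIDI pitch toward the nearest scale degree.
            strength=0 keeps the raw pitch; strength=1 fully quantizes."""
            octave, degree = divmod(pitch, 12.0)
            nearest = min(scale + (12,), key=lambda d: abs(d - degree))
            return pitch + strength * (12.0 * octave + nearest - pitch)

        raw = position_to_midi(0.52)   # continuous pitch from pen position
        held = attract(raw)            # partially magnetized toward C major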

  • L. Feugère, C. d’Alessandro
    Rule-based performative synthesis of sung syllables
    Proceedings of the International Conference on New Interfaces for Musical Expression, London, United Kingdom, June 30 - July 3, 2014, 86-87. ISBN: 978-1-906897-29-1

2013

  • L. Feugère, C. d’Alessandro, B. Doval
    Performative voice synthesis for edutainment in acoustic phonetics and singing: a case study using the “Cantor Digitalis” (postprint)
    5th International ICST Conference on Intelligent Technologies for Interactive Entertainment (INTETAIN), Mons, Belgium, July 3-5, 2013. In Intelligent Technologies for Interactive Entertainment, Revised Selected Papers, Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, Vol. 124, Springer, 169-178, 2013. ISBN 978-3-319-03891-9.

    Abstract: Real-time, gesture-controlled voice synthesis software is applied to edutainment in the field of voice pedagogy. The main goals are to teach how the voice works and what makes the difference between voices, from an interactive, real-time and audio-visual perspective. The project is based on “Cantor Digitalis”, a digital singing-vowel instrument featuring an improved formant synthesizer controlled by a stylus and a touch graphic tablet. Demonstrated in various pedagogical situations, this application allows for simple and interactive explanation of difficult and/or abstract voice-related phenomena, such as source-filter theory, vocal formants, the effect of vocal tract size, voice categories, voice source parameters, intonation and articulation. This is achieved by systematic and interactive listening and playing with the sound of a virtual voice, related to the hand motions and dynamics on the tablet.

  • L. Feugère, C. d’Alessandro
    Digitartic: bi-manual gestural control of articulation in performative singing synthesis (postprint)
    13th International Conference on New Interfaces for Musical Expression, Daejeon, Republic of Korea, May 27-30, 2013, 331-336. ISSN 2220-4806.

    Abstract: Digitartic, a system for bi-manual gestural control of Vowel-Consonant-Vowel performative singing synthesis, is presented. This system is an extension of a real-time gesture-controlled vowel singing instrument developed in the Max/MSP language. In addition to pitch, vowel and voice strength controls, Digitartic is designed for gestural control of articulation parameters, including various places and manners of articulation. The phases of articulation between two phonemes are continuously controlled and can be driven in real time, without noticeable delay, at any stage of synthetic phoneme production. Thus, as in natural singing, very accurate rhythmic patterns can be produced and adapted while playing with other musicians. The instrument features two (augmented) pen tablets for controlling voice production: one dealing with the glottal source and vowels, the other with consonant/vowel articulation. The results show very natural consonant and vowel synthesis. Virtual choral practice confirms the effectiveness of Digitartic as an expressive musical instrument.

2012

  • L. Feugère, C. d’Alessandro
    Digitartic : synthèse gestuelle de syllabes chantées (Digitartic: gestural synthesis of sung syllables)
    Journées d’Informatique Musicale (JIM 2012), Mons, Belgium, May 9-11, 2012, 219-225. Non-indexed proceedings available online.

    Abstract: We present Digitartic, a vocal synthesis musical instrument allowing gestural control of Vowel-Consonant-Vowel articulation. Digitartic is a continuation of the synthetic sung-vowel instrument Cantor Digitalis, using formant synthesis and developed in the Max/MSP environment. Analogies between percussive gestures and the constriction gestures involved in consonant production are developed. Digitartic allows real-time control of the articulation instant, of the temporal evolution of formants, of occlusion noises and of aspiration. The place of articulation can vary continuously by interpolation between the places of articulation of reference consonants. The type of control model to use for a given application is discussed, drawing on gestural analogies and real-time constraints. A synthesis model of articulation is presented, using the control capabilities of a graphic tablet. Examples of synthetic syllables demonstrate that the concept of gestural control of articulation, by analogy with percussion, is valid.

  • S. De Laubier, G. Bertrand, H. Genevois, V. Goudard, B. Doval, L. Feugère, S. Le Beux, C. d’Alessandro
    OrJo et la Méta-Mallette 4.0 (OrJo and the Méta-Mallette 4.0)
    Journées d’Informatique Musicale (JIM 2012), Mons, Belgium, May 9-11, 2012, 227-232. Non-indexed proceedings available online.

    Abstract: This article describes the OrJo research project (Orchestre de Joysticks, 2009-2012), which brings together four partners, PUCE MUSE, the LAM (UPMC), the LIMSI (CNRS, associated with UPMC and Université Paris-Sud), and 3Dlized, around four main objectives: 1. producing four versions of the platform software to fit different uses; 2. offering a collection of virtual sound and visual instruments; 3. improving the graphical representation of virtual instruments; 4. practising, exchanging and preserving a repertoire on the Méta-Librairie. The project explores several new practices, such as orchestral playing of virtual instruments, the exchange and transmission of interactive scores for these orchestras, and the contribution of stereoscopic 3D to visual music. It also covers the development of virtual instruments, such as the voice synthesis instruments of the LIMSI and the physical-model, topological-model and statistical-model instruments of the Lutheries Acoustique Musique (LAM) team. This article is also an invitation to use the Méta-Mallette platform via its (free) SDK and the Méta-Librairie exchange site.

2011

  • S. Le Beux, L. Feugère, C. d’Alessandro
    Chorus Digitalis: experiment in chironomic choir singing
    12th Annual Conference of the International Speech Communication Association (INTERSPEECH 2011), Firenze, Italy, August 27-31, 2011, 2005-2008. ISSN 1990-9772.

    Abstract: This paper reports on experiments in real-time gestural control of voice synthesis. The ability of handwriting gestures to control singing intonation (chironomic singing synthesis) is studied. In a first part, the singing synthesizer and controller are described. The system is developed in an environment for multi-user music synthesis, allowing for synthetic choir singing. In a second part, the performances of subjects playing with the system are analyzed. The results show that chironomic singers are able to control melody with accuracy, to perform vibrato, portamento and other types of fine-grained intonation variation, and to give convincing musical performances.

  • L. Feugère, S. Le Beux, C. d’Alessandro
    Chorus Digitalis: polyphonic gestural singing
    1st International Workshop on Performative Speech and Singing Synthesis (P3S 2011), Vancouver, Canada, March 14-15, 2011, 4 p. Non-indexed printed proceedings.

    Abstract: Chorus Digitalis is a choir of gesture-controlled digital singers. Chorus Digitalis is based on Cantor Digitalis, a gesture-controlled singing voice synthesizer, and the Méta-Mallette, an environment designed for collective electronic music and video performances. Cantor Digitalis is an improved formant synthesizer, using the RT-CALM voice source model and source-filter interaction mechanisms. Chorus Digitalis is the result of the integration of voice synthesis into the Méta-Mallette environment. Each virtual voice is controlled by both a graphic tablet and a joystick. Polyphonic singing performances of Chorus Digitalis with four players will be given at the conference. The Méta-Mallette and Cantor Digitalis are implemented in Max/MSP.
