An Approach to Adding Knowledge Constraints to a Data-driven Generative Model for Carnatic Rhythm Sequence
Computational models for generative music are a recent trend in AI-based technology development. However, an entirely data-driven strategy often falls short of capturing naturally occurring rhythmic grouping. Guedes et al. [1, 2] proposed dictionary-based and stroke-grouping-based approaches to generate novel sequences in the 8-beat cycle of Aditala. More recently, an attempt was made to incorporate arithmetic partitioning, as conceived by performers, to overcome the drawback of the earlier model: it captures local, short-term phrasing but fails to capture the long-term structure and grammar of this particular idiom. One way of addressing this issue is to consider a rhythmic phrase as a gestalt, i.e., to hypothesize three rationales:
(i) a sequence of strokes, when played at a faster speed, behaves as an independent unit and not as a mere time-compressed version of the reference;
(ii) context influences the accent – the same phrase is played differently when it is part of a composition than when it serves as a filler (ornamentation) during improvisation;
(iii) phrases show a co-articulation effect – the gesture differs in anticipation of the forthcoming stroke or pattern.
Initial experiments show that a reference phrase time-compressed to 4x speed sounds perceptibly different from the same phrase played at 4x speed by the same musician. This indicates a gestural difference in articulating the same phrase at different speeds. We extract timbral features to understand these differences; however, the context dependence appears to be captured at a supra-segmental level, which motivates us to investigate prosodic features. It also suggests that a syntactically correct sequence may not be semantically plausible with respect to a musician’s expectancy. As the qualitative evaluation of CAMel involves expert listening, we believe that adding the proposed knowledge constraints would improve the naturalness, and hence the acceptability, of the generated sequences.
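As an illustration only, the comparison between a time-compressed phrase and the same phrase played at 4x speed could be sketched as below. This is a minimal sketch under stated assumptions, not the feature set used in this work: the abstract does not name the timbral features, so the spectral centroid is assumed here as one common timbral descriptor, and the function names `spectral_centroid` and `centroid_gap` are hypothetical. Both recordings are assumed to be already loaded as mono arrays at the same sample rate.

```python
import numpy as np


def spectral_centroid(signal, sample_rate, frame_len=1024, hop=512):
    """Return the spectral centroid (Hz) of each frame of a mono signal."""
    signal = np.asarray(signal, dtype=float)
    # Zero-pad very short signals so that at least one frame is available.
    if len(signal) < frame_len:
        signal = np.pad(signal, (0, frame_len - len(signal)))
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    centroids = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        magnitude = np.abs(np.fft.rfft(frame))
        total = magnitude.sum()
        # Silent frames have no defined centroid; report 0 Hz for them.
        centroids.append((freqs * magnitude).sum() / total if total > 0 else 0.0)
    return np.array(centroids)


def centroid_gap(compressed, played, sample_rate):
    """Mean spectral centroid (Hz) of `played` minus that of `compressed`.

    Hypothetical helper: `compressed` is the time-compressed reference and
    `played` is the same phrase performed at the faster speed.
    """
    return (spectral_centroid(played, sample_rate).mean()
            - spectral_centroid(compressed, sample_rate).mean())
```

A non-zero gap would only indicate that the two renditions differ in spectral brightness. A frame-level descriptor of this kind does not, by itself, capture the supra-segmental context dependence discussed above.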