Rafael Valle
Email:

I'm a fifth-year PhD candidate at UC Berkeley advised by Prof. Sanjit Seshia and Prof. Edmund Campion. My research focuses on machine listening and improvisation. At UC Berkeley, I'm part of the TerraSwarm Research Center, where I work on problems related to adversarial attacks and verified artificial intelligence.

During Fall 2016 I was a Research Intern at Gracenote in Emeryville, where I worked on audio classification using deep learning. Previously I was a Scientist Intern at Pandora in Oakland, where I investigated segments and scores that describe novelty-seeking behavior in listeners.

Before coming to Berkeley, I completed a master's in Computer Music from HMDK Stuttgart in Germany and a bachelor's in Orchestral Conducting from UFRJ in Brazil.

CV | LinkedIn | github

News
  • Paper about attacking speaker recognition with deep generative models is under review.
  • Paper about interesting properties of samples generated with GANs is under review.
  • Paper about sequence generation (text, speech, music) with GANs is under review.
Publications

[NEW] Attacking Speaker Recognition with Deep Generative Models
Anish Doshi, Wilson Cai and Rafael Valle
under review, 2017

pdf | abstract

In this paper we investigate the ability of generative adversarial networks (GANs) to synthesize spoofing attacks on modern speaker recognition systems. We first show that the modern architectures of SampleRNN and WaveNet are unable to fool CNN-based speaker recognition systems. We propose a modification of the Wasserstein GAN objective function to make use of data that is real but not from the class being learned. Our method is able to perform both targeted and untargeted attacks against state-of-the-art systems, which calls attention to issues related to security.

[NEW] Interesting Properties of GAN Samples
Rafael Valle, Wilson Cai and Anish Doshi
under review, 2017

pdf | abstract

In this paper we investigate numerical properties of samples produced with adversarial methods, especially Generative Adversarial Networks. We analyze pixel value statistics of real and fake data and compute distances based on the marginal distribution of perceptually significant features. We provide results on MNIST, music and speech data and show that GAN-generated samples have interesting signatures that can be used to identify the source of the data and detect adversarial attacks.

[NEW] Sequence Generation with GANs
Rafael Valle
under review, 2017

github | abstract | audio

In this paper we investigate the generation of sequences using generative adversarial networks (GANs). We open the paper by providing a brief introduction to sequence generation and challenges in GANs. We briefly describe encoding strategies for text and MIDI data in light of their use with convolutional architectures. In our experiments we consider the unconditional generation of polyphonic and monophonic piano rolls as well as short sequences. For each data type, we provide sonic or text examples of generated data, interpolation in the latent space and vector arithmetic.

Audio-Based Room Occupancy Analysis using Gaussian Mixtures and Hidden Markov Models
Rafael Valle
Future Technologies Conference (FTC), 2016
Detection and Classification of Acoustic Scenes and Events, 2016

pdf | abstract | bibtex | arXiv | code

This paper outlines preliminary steps towards the development of an audio-based room-occupancy analysis model. Our approach borrows from the speech recognition tradition and is based on Gaussian Mixtures and Hidden Markov Models. We analyse possible challenges encountered in the development of such a model, and offer several solutions including feature design and prediction strategies. We provide results obtained from experiments with audio data from a retail store in Palo Alto, California. Model assessment is done via leave-two-out bootstrap and model convergence achieves good accuracy, thus representing a contribution to multimodal people-counting algorithms.

      @article{valle2016abroa,
        title={ABROA: Audio-Based Room-Occupancy Analysis using Gaussian Mixtures and Hidden Markov Models},
        author={Valle, Rafael},
        journal={arXiv preprint arXiv:1607.07801},
        year={2016}
      }
      
Missing Data Imputation for Supervised Learning
Jason Poulos and Rafael Valle
arXiv, 2016

pdf | abstract | bibtex | arXiv | code

This paper compares methods for imputing missing categorical data for supervised learning tasks. The ability of researchers to accurately fit a model and yield unbiased estimates may be compromised by missing data, which are prevalent in survey-based social science research. We experiment on two machine learning benchmark datasets with missing categorical data, comparing classifiers trained on non-imputed (i.e., one-hot encoded) or imputed data with different degrees of missing-data perturbation. The results show imputation methods can increase predictive accuracy in the presence of missing-data perturbation. Additionally, we find that for imputed models, missing-data perturbation can improve prediction accuracy by regularizing the classifier.

      @article{poulos2016missing,
        title={Missing Data Imputation for Supervised Learning},
        author={Poulos, Jason and Valle, Rafael},
        journal={arXiv preprint arXiv:1610.09075},
        year={2016}
      }
      
Learning and Visualizing Music Specifications using Pattern Graphs
Rafael Valle, Daniel Fremont, Ilge Akkaya, Alexandre Donze, Adrian Freed and Sanjit Seshia
ISMIR, 2016

pdf | abstract | bibtex | code

We describe a system to learn and visualize specifications from song(s) in symbolic and audio formats. The core of our approach is based on a software engineering procedure called specification mining. Our procedure extracts patterns from feature vectors and uses them to build pattern graphs. The feature vectors are created by segmenting song(s) and extracting time- and frequency-domain features from them, such as chromagrams, chord degree and interval classification. The pattern graphs built on these feature vectors provide the likelihood of a pattern between nodes, as well as start and end nodes. The pattern graphs learned from song(s) describe formal specifications that can be used for human-interpretable quantitative and qualitative song comparison or to perform supervisory control in machine improvisation. We offer results in song summarization, song and style validation and machine improvisation with formal specifications.

      @inproceedings{valle2016learning,
        title={Learning and Visualizing Music Specifications using Pattern Graphs},
        author={Valle, Rafael and Fremont, Daniel J and Akkaya, Ilge and Donze, Alexandre and Freed, Adrian and Seshia, Sanjit S},
        booktitleaddon={Proceedings of the Seventeenth ISMIR Conference},
        booktitle={ISMIR},
        year={2016}
      }