ganesh-srinivas / keybase.md

Created December 21, 2018 08:05

Keybase proof

I hereby claim:

I am ganesh-srinivas on github.
I am gsrinivas (https://keybase.io/gsrinivas) on keybase.
I have a public key ASBDA9vJrkbic6qLa_R93POWmmLCcuxmF_pKgkeo32ZGXwo

To claim this, I am signing this object:

ganesh-srinivas / proposal-dark-data-extraction-research.md

Last active November 24, 2022 15:12

Proposal for Dark Data Extraction Research

This document tracks progress, ideas and source code for dark data extraction systems. These systems use statistical inference to perform data extraction, integration and cleaning from unstructured ("dark") sources such as forum posts and webpages. Data programming is the predominant paradigm for dark data extraction: noisy, possibly conflicting user-defined labelling functions are supplied to a generative model, which recovers the parameters of the labelling process. Wherever possible, my projects are based on Snorkel/DeepDive.
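As a rough illustration of the data programming setup described above, here is a minimal Python sketch. It does not use the Snorkel or DeepDive APIs. The three labelling functions, the example posts and the "for sale" task are all made up for illustration, and a simple majority vote stands in for the generative model (which would instead estimate how accurate each function is and weight its votes accordingly).

```python
from collections import Counter

# Label values; ABSTAIN means the function has no opinion on this example.
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1

# Hypothetical noisy labelling functions for a toy task:
# "is this forum post a for-sale listing?"
def lf_mentions_price(text):
    return POSITIVE if "$" in text else ABSTAIN

def lf_for_sale_phrase(text):
    return POSITIVE if "for sale" in text.lower() else ABSTAIN

def lf_is_question(text):
    return NEGATIVE if text.strip().endswith("?") else ABSTAIN

LABELING_FUNCTIONS = [lf_mentions_price, lf_for_sale_phrase, lf_is_question]

def label_matrix(texts):
    """One row per text, one column per labelling function."""
    return [[lf(t) for lf in LABELING_FUNCTIONS] for t in texts]

def majority_vote(row):
    """Stand-in for the generative model: unweighted vote, ABSTAIN on ties."""
    votes = Counter(v for v in row if v != ABSTAIN)
    if not votes:
        return ABSTAIN
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting functions with equal support
    return ranked[0][0]

posts = [
    "Bike for sale, $120 or best offer",
    "Is this still for sale?",
    "Anyone know a good repair shop?",
    "Great weather today",
]
matrix = label_matrix(posts)
labels = [majority_vote(row) for row in matrix]
```

The second post shows why a learned model is preferable to a vote: two functions disagree, and the vote can only abstain, whereas a generative model could favour the function it has estimated to be more reliable.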

Ideas (Extensions for the system):

There isn't any work on domain-specific primitives (DSPs) for audio data. Pre-trained audio models (e.g. VGGish) can serve as feature extractors for high-level concepts such as emotion, accent and personality in speech data (the WaveNet paper mentions that these are possible), musical genre (Sander Dieleman's Spotify CNN blog post), etc.
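To make the idea of an audio primitive concrete, here is a small Python sketch under heavy assumptions. The `embed` function is a hypothetical stand-in for a pre-trained model such as VGGish (it only computes RMS energy and zero-crossing rate, not a learned embedding). The primitive names, the thresholds and the labelling function are likewise invented for illustration.

```python
import math

def embed(waveform):
    """Hypothetical stand-in for a pre-trained audio model.

    Returns [RMS energy, zero-crossing rate] instead of a learned embedding.
    """
    n = len(waveform)
    rms = math.sqrt(sum(x * x for x in waveform) / n)
    crossings = sum(
        1 for a, b in zip(waveform, waveform[1:]) if (a < 0) != (b < 0)
    )
    return [rms, crossings / (n - 1)]

def primitives(waveform):
    """Map low-level features to named high-level primitives.

    The thresholds are illustrative assumptions, not tuned values.
    """
    rms, zcr = embed(waveform)
    return {"loud": rms > 0.3, "bright": zcr > 0.2}

ABSTAIN, POSITIVE = -1, 1

def lf_loud_bright_event(waveform):
    """Labelling function written against primitives, not raw samples."""
    p = primitives(waveform)
    return POSITIVE if p["loud"] and p["bright"] else ABSTAIN

def tone(freq_hz, amplitude, n=800, sample_rate=8000, phase=0.1):
    """Synthetic sine tone used only to exercise the sketch."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate + phase)
        for t in range(n)
    ]

loud_high = tone(1000, 0.8)   # high energy, many zero crossings
quiet_low = tone(100, 0.1)    # low energy, few zero crossings

result_loud_high = lf_loud_bright_event(loud_high)
result_quiet_low = lf_loud_bright_event(quiet_low)
```

The point of the indirection is that labelling functions are written in terms of named concepts ("loud", "bright"), so the feature extractor underneath can be swapped for a real pre-trained model without rewriting them.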

Ideas (Applications):

Ecological/Environmental monitoring: use audio DSPs for building models of migration, logging/poaching, etc.

ganesh-srinivas / gsoc_redhenlab_laughter_categorization.md

Last active September 29, 2017 06:53

GSoC 2017 - Red Hen Lab - Learning Embeddings for Laughter Categorization - Work Product Submission

Learning Embeddings for Laughter Categorization

https://github.com/ganesh-srinivas/laughter/

UPDATE: This project was deemed successful, and I received a very positive evaluation from my mentors! :-) (you can view it at http://ganesh-srinivas.github.io/gsoc_final_evaluation.pdf)

The main deliverables of this project are machine learning classifiers for laughter detection and categorization: they identify whether an audio clip contains laughter and categorize the laughter (giggle, baby laugh, chuckle/chortle, snicker, belly laugh).

Model Architecture	Input Feature	Output pooling	Test set metrics