Ph.D. Candidate in Computer Science and Cognitive Science at Indiana University.
Research Group: Signals & Artificial Intelligence Group in Engineering (SAIGE)
Email: ZHenK At IU doT EdU
I conduct research on audio and acoustic signal processing in the current deep/machine learning paradigm, with a focus on both model capacity and efficiency. Concretely, I have been working on cross-module residual learning for lightweight speech coding, an approach that is compatible with both advanced, fast-changing data-driven modules and conventional methodologies in audiology. For monaural speech enhancement, we proposed a hybrid architecture that combines a CNN and an RNN in a densely connected manner to enable efficient dual-level context aggregation. In addition, I worked on a psychoacoustic weighting scheme that steers model training towards an energy-efficient speech denoising autoencoder. My supervisor is Prof. Minje Kim.
Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, and Minje Kim, "Efficient and Scalable Neural Residual Waveform Coding with Collaborative Quantization" [Demo]
Kai Zhen, Mi Suk Lee, and Minje Kim, "A Dual-Staged Context Aggregation Method towards Efficient End-to-End Speech Enhancement" [Demo]
Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, and Minje Kim, "Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding," In Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH'19), Graz, Austria, September 15-19, 2019.
[PDF] [BibTex] [Demo]
Kai Zhen, Aswin Sivaraman, Jongmo Sung, and Minje Kim, "On Psychoacoustically Weighted Cost Functions Towards Resource-efficient Deep Neural Networks for Speech Denoising."
[PDF] [BibTex] [US Patent App. 16/122,708]
Peter Miksza, Kevin Watson, Kai Zhen, Sanna Wager, and Minje Kim. Relationships between experts' subjective ratings of jazz improvisations and computational measures of melodic entropy. Paper presented at the Improvising Brain III: Cultural Variation and Analytical Techniques Symposium, Atlanta, GA, February 2017. (Currently in the data analysis phase.)
Kai Zhen and David Crandall. Finding egocentric image topics through convolutional neural network based representations. In IEEE Conference on Computer Vision and Pattern Recognition Workshop on Egocentric Computer Vision, Las Vegas, 2016. (Poster).
I served as a reviewer for ICASSP 2019.
I served as a reviewer for the EURASIP Journal on Advances in Signal Processing.
I served as a sub-reviewer for AAAI 2017 and 2018.