Paper notes 1


VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
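Proposes a method for separating only a specific speaker's voice from overlapping multi-speaker speech. Takes the d-vector of the speaker to be extracted and the overlapping multi-speaker speech as input, and computes the ideal mask.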
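
A minimal sketch of the masking idea, assuming magnitude spectrograms stored as nested lists. The function names are hypothetical, the mask network itself is omitted, and appending the d-vector to every frame is only one plausible way to condition on the speaker, not necessarily the paper's exact architecture. The oracle ratio mask is included only to show what a perfect mask would do to the mixture.

```python
from typing import List

Matrix = List[List[float]]  # indexed as [time_frame][frequency_bin]


def condition_on_speaker(spec: Matrix, d_vector: List[float]) -> Matrix:
    """Append the same speaker embedding to every time frame (assumed conditioning)."""
    return [frame + d_vector for frame in spec]


def oracle_ratio_mask(target: Matrix, mixture: Matrix, eps: float = 1e-8) -> Matrix:
    """Mask in [0, 1] that would recover the target magnitudes from the mixture."""
    return [
        [min(1.0, t / (x + eps)) for t, x in zip(t_row, x_row)]
        for t_row, x_row in zip(target, mixture)
    ]


def apply_mask(mixture: Matrix, mask: Matrix) -> Matrix:
    """Element-wise product of the mask and the mixture spectrogram."""
    return [
        [m * x for m, x in zip(m_row, x_row)]
        for m_row, x_row in zip(mask, mixture)
    ]


if __name__ == "__main__":
    mixture = [[2.0, 4.0], [1.0, 3.0]]   # toy mixed-speech magnitudes
    target = [[1.0, 4.0], [0.0, 1.5]]    # toy target-speaker magnitudes
    mask = oracle_ratio_mask(target, mixture)
    print(apply_mask(mixture, mask))     # close to `target`
```

In a trained system the mask would come from a network fed with the conditioned frames, and the masked spectrogram would be compared against the clean target during training.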

 

Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition
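When training a seq-to-seq ASR model, uses the posterior probabilities over the transcription, computed by a separately trained RNNLM, as soft labels. In effect, performs knowledge distillation from an external LM.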
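
A minimal sketch of a soft-label loss for a single output step, assuming the LM posterior is already available as a probability vector. The interpolation weight `lam` and all function names are assumptions made for illustration, not details taken from the note.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs: List[float], student_probs: List[float],
                  eps: float = 1e-12) -> float:
    """Cross entropy of the student distribution against a target distribution."""
    return -sum(t * math.log(s + eps) for t, s in zip(target_probs, student_probs))


def distillation_loss(student_logits: List[float], teacher_probs: List[float],
                      gold_index: int, lam: float = 0.5) -> float:
    """Interpolate the hard-label loss with the soft-label (teacher LM) loss."""
    student_probs = softmax(student_logits)
    one_hot = [1.0 if i == gold_index else 0.0 for i in range(len(student_logits))]
    hard = cross_entropy(one_hot, student_probs)        # ground-truth transcript
    soft = cross_entropy(teacher_probs, student_probs)  # RNNLM posterior as soft label
    return (1.0 - lam) * hard + lam * soft


if __name__ == "__main__":
    lm_posterior = [0.7, 0.2, 0.1]  # toy RNNLM distribution over 3 tokens
    print(distillation_loss([2.0, 0.5, -1.0], lm_posterior, gold_index=0))
```

With `lam=0.0` this reduces to ordinary cross-entropy training on the transcript; larger values lean more heavily on the LM's distribution.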

 

Robust Neural Machine Translation with Doubly Adversarial Inputs
Proposes a method for generating adversarial examples for NMT. Adversarial inputs are computed for both the encoder input and the decoder input.
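
The note does not describe how the adversarial inputs are chosen. The sketch below shows one gradient-guided word substitution rule as an assumption: among candidate replacements, pick the word whose embedding shift is best aligned with the loss gradient at that position. The function names, the toy embeddings, and the gradient vector are all hypothetical.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pick_adversarial_word(current: str, candidates: List[str],
                          embeddings: Dict[str, List[float]],
                          grad: List[float]) -> str:
    """Choose the candidate whose embedding shift points most along the gradient."""
    cur_vec = embeddings[current]

    def score(word: str) -> float:
        shift = [a - b for a, b in zip(embeddings[word], cur_vec)]
        return cosine(shift, grad)

    return max(candidates, key=score)


if __name__ == "__main__":
    emb = {"good": [1.0, 0.0], "great": [1.0, 0.5], "bad": [-1.0, 0.0]}
    loss_grad = [0.0, 1.0]  # toy gradient of the loss w.r.t. the embedding of "good"
    print(pick_adversarial_word("good", ["great", "bad"], emb, loss_grad))
```

The same rule could be applied once to source-side tokens (encoder input) and once to target-side tokens (decoder input), each with its own gradient and candidate set.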

 

Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks
Obtains sentence embeddings with CBOW and an encoder-decoder, then examines how they perform on low-level tasks such as the length test, content test, and order test. (There is clearly more to be done in this area, but the authors do a good job shedding some light on what sentence embeddings can encode. We need more work like this that helps us understand what neural networks can model.)
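
A minimal sketch of how labels for the three low-level tasks could be built, together with a CBOW-style sentence embedding taken as the average of word vectors. The helper names, the length bin size, and the toy vectors are assumptions; the probing classifier that would be trained on top of the embeddings is omitted.

```python
from typing import Dict, List


def cbow_embedding(tokens: List[str], word_vectors: Dict[str, List[float]]) -> List[float]:
    """Sentence embedding as the average of its word vectors."""
    dim = len(word_vectors[tokens[0]])
    total = [0.0] * dim
    for tok in tokens:
        for i, value in enumerate(word_vectors[tok]):
            total[i] += value
    return [t / len(tokens) for t in total]


def length_label(tokens: List[str], bin_size: int = 4) -> int:
    """Length test: index of the length bin the sentence falls into."""
    return (len(tokens) - 1) // bin_size


def content_label(tokens: List[str], word: str) -> int:
    """Content test: 1 if the word occurs in the sentence, else 0."""
    return int(word in tokens)


def order_label(tokens: List[str], first: str, second: str) -> int:
    """Order test: 1 if `first` appears before `second` in the sentence, else 0."""
    return int(tokens.index(first) < tokens.index(second))


if __name__ == "__main__":
    vectors = {"the": [1.0, 0.0], "cat": [0.0, 1.0], "sat": [1.0, 1.0]}
    sentence = ["the", "cat", "sat"]
    print(cbow_embedding(sentence, vectors))
    print(length_label(sentence), content_label(sentence, "cat"),
          order_label(sentence, "the", "sat"))
```

Because averaging ignores token order, this CBOW embedding is identical for a sentence and its reversal, which is part of what makes the order test an informative probe.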