Related Papers in ICLR 2020 (2020.04.26)
Anomaly
- [Read] Deep Semi-Supervised Anomaly Detection - Lukas Ruff, Robert A. Vandermeulen, Nico Görnitz, Alexander Binder, Emmanuel Müller, Klaus-Robert Müller, Marius Kloft
- [Read] Iterative energy-based projection on a normal data manifold for anomaly localization - David Dehaene, Oriel Frigo, Sébastien Combrexelle, Pierre Eline
  - Task: the paper frames this as anomaly localization, but it can also be viewed as anomaly attribution. It works for any anomaly detector built on a VAE/AE backbone (my understanding: it works for any method trained in a one-class fashion that computes anomaly scores from reconstruction error).
  - Basic procedure (a code sketch of the full update loop follows this list):
    - The paper assumes a VAE trained in a one-class setting, i.e., modeling the reconstruction and the distribution of normal data. The loss is $L = L_{reconstruction} + L_{KL}(p, q)$.
    - Since the loss above is minimized for normal samples, for any anomalous sample $x_a$, minimizing the network's response $L(x)$ with respect to the input transforms $x_a$ into its normal counterpart $x_n$. Under this assumption, the network's loss itself serves as the sample's anomaly score.
    - Define the optimization objective for anomaly localization as $E = L$. Since prior work has shown that $L_{KL}$ hurts anomaly-score computation, that term is dropped, leaving $E = L_{reconstruction}$.
    - Since the goal of localization is only to point out the anomalous parts of the sample, not to reshape the whole sample into something resembling a training sample, a regularization term is added to keep the normal version close to the original anomalous sample: $L_{regularization} = \left\| x_n - x_a \right\|_0$. Because the L0 norm is not differentiable, the L1 norm is used as an approximation: $L_{regularization} = \left\| x_n - x_a \right\|_1$. The energy optimized for anomaly localization is therefore $E = L_{reconstruction} + \left\| x_n - x_a \right\|_1$, where the $x_n$ here corresponds to the $x_{old}$ below.
    - How is $x_a$ iteratively transformed into $x_n$? Define the gradient-descent update $x_{new} = x_{old} - \alpha \nabla_x E$.
    - Since only the anomalous pixels should be updated, a mask is applied on top of the formula above: $x_{new} = x_{old} - \alpha \, (\nabla_x E \odot Mask)$. In the unsupervised setting $Mask$ is unknown, so the pixel-wise reconstruction error is used in its place, giving the update $x_{new} = x_{old} - \alpha \, (\nabla_x E \odot (x_{old} - f_{VAE}(x_{old}))^2)$.
    - "Optimizing the energy this way, a pixel where the reconstruction error is high will update faster, whereas a pixel with good reconstruction will not change easily. This prevents the image from updating its pixels where the reconstruction is already good, even with a high learning rate."
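  A minimal PyTorch sketch of the masked update loop above (not the authors' code). The reconstruction model `vae`, the step size `alpha`, and the step count `n_steps` are hypothetical placeholders; any trained AE/VAE that maps an input to its reconstruction would fit, and the reconstruction term here is taken as plain squared error, matching the $(x_{old} - f_{VAE}(x_{old}))^2$ form in the notes.

  ```python
  import torch

  def project_to_normal_manifold(x_a, vae, alpha=0.5, n_steps=100):
      """Iteratively project an anomalous input x_a toward its normal version x_n.

      vae: any trained reconstruction model x -> f_VAE(x) (hypothetical placeholder).
      Implements x_new = x_old - alpha * (grad_x(E) * (x_old - f_VAE(x_old))^2),
      with E = L_reconstruction + ||x - x_a||_1.
      """
      x = x_a.detach().clone().requires_grad_(True)
      for _ in range(n_steps):
          recon = vae(x)  # f_VAE(x_old)
          # Energy E: squared reconstruction error + L1 regularization toward x_a
          energy = ((x - recon) ** 2).sum() + (x - x_a).abs().sum()
          (grad,) = torch.autograd.grad(energy, x)
          with torch.no_grad():
              mask = (x - recon).pow(2)   # pixel-wise reconstruction error as the mask
              x -= alpha * grad * mask    # masked gradient step
      return x.detach()  # x_n
  ```

  After the loop, the pixel-wise difference between $x_a$ and the returned $x_n$ serves as the anomaly localization (or attribution) map.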
- [Read] Robust anomaly detection and backdoor attack detection via differential privacy - Min Du, Ruoxi Jia, Dawn Song
- Classification-Based Anomaly Detection for General Data ("general data": multiple data types, images and so on) - Liron Bergman, Yedid Hoshen
- [Read] Robust Subspace Recovery Layer for Unsupervised Anomaly Detection - Chieh-Hsin Lai, Dongmian Zou, Gilad Lerman
- Attention Guided Anomaly Detection and Localization in Images (arXiv 2019) - Shashanka Venkataramanan, Kuan-Chuan Peng, Rajat Vikram Singh, Abhijit Mahalanobis
Sequence
- Are Transformers universal approximators of sequence-to-sequence functions? - Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank Reddi, Sanjiv Kumar 
- DeFINE: Deep Factorized Input Token Embeddings for Neural Sequence Modeling 
- Revisiting Self-Training for Neural Sequence Generation - Junxian He, Jiatao Gu, Jiajun Shen, Marc’Aurelio Ranzato 
- Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation - Xinjie Fan, Yizhe Zhang, Zhendong Wang, Mingyuan Zhou 
- Compressive Transformers for Long-Range Sequence Modelling - Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Chloe Hillier, Timothy P. Lillicrap 
- Model-based reinforcement learning for biological sequence design - Christof Angermueller, David Dohan, David Belanger, Ramya Deshpande, Kevin Murphy, Lucy Colwell 
- Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models - Xisen Jin, Zhongyu Wei, Junyi Du, Xiangyang Xue, Xiang Ren 
Time Series
- [Read] N-BEATS: Neural basis expansion analysis for interpretable time series forecasting (work on interpretability) - Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados, Yoshua Bengio
- Dynamic Time Lag Regression: Predicting What & When - Mandar Chandorkar, Cyril Furtlehner, Bala Poduval, Enrico Camporeale, Michele Sebag 
- Intensity-Free Learning of Temporal Point Processes - Oleksandr Shchur, Marin Biloš, Stephan Günnemann 
Recurrent
- Variational Recurrent Models for Solving Partially Observable Control Tasks - Dongqi Han, Kenji Doya, Jun Tani 
- One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation - Shunshi Zhang, Bradly C. Stadie 
- Recurrent neural circuits for contour detection - Drew Linsley*, Junkyung Kim*, Alekh Ashok, Thomas Serre 
- Improved memory in recurrent neural networks with sequential non-normal dynamics (could non-normal dynamics describe the system in our own work?) - Emin Orhan, Xaq Pitkow
- Economy Statistical Recurrent Units For Inferring Nonlinear Granger Causality - Saurabh Khanna, Vincent Y. F. Tan 
- Decoding As Dynamic Programming For Recurrent Autoregressive Models - Najam Zaidi, Trevor Cohn, Gholamreza Haffari 
- Training Recurrent Neural Networks Online by Learning Explicit State Variables - Somjit Nath, Vincent Liu, Alan Chan, Xin Li, Adam White, Martha White 
- Understanding Generalization in Recurrent Neural Networks - Zhuozhuo Tu, Fengxiang He, Dacheng Tao 
- RNNs Incrementally Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients? - Anil Kag, Ziming Zhang, Venkatesh Saligrama 
- Implementing Inductive bias for different navigation tasks through diverse RNN attractors - Tie Xu, Omri Barak
- Symplectic Recurrent Neural Networks - Zhengdao Chen, Jianyu Zhang, Martin Arjovsky, Léon Bottou 
Interpretable
- Overlearning Reveals Sensitive Attributes - Congzheng Song, Vitaly Shmatikov 
- Explain Your Move: Understanding Agent Actions Using Specific and Relevant Feature Attribution - Nikaash Puri, Sukriti Verma, Piyush Gupta, Dhruv Kayastha, Shripad Deshmukh, Balaji Krishnamurthy, Sameer Singh 
Autoencoder
- [Read] From Variational to Deterministic Autoencoders - Partha Ghosh, Mehdi S. M. Sajjadi, Antonio Vergari, Michael Black, Bernhard Schölkopf
- Mixed-curvature Variational Autoencoders
- Mogrifier LSTM - Gábor Melis, Tomáš Kočiský, Phil Blunsom 
Data augmentation
- ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring - David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Kihyuk Sohn, Han Zhang, Colin Raffel