MLSP 2020

IEEE International Workshop on Machine Learning for Signal Processing

September 21–24, 2020, Aalto University, Espoo, Finland (virtual conference)

Wednesday, September 23, 2020


Lecture Session 4: Deep Learning for Inverse Problems

Chair: Samuli Siltanen, University of Helsinki


A hybrid interior point - deep learning approach for Poisson image deblurring
Mathilde Galinier, Marco Prato, Emilie Chouzenoux, Jean-Christophe Pesquet

In this paper we address the problem of deconvolution of an image corrupted with Poisson noise by reformulating the restoration process as a constrained minimization of a suitable regularized data fidelity function. The minimization step is performed by means of an interior-point approach, in which the constraints are incorporated within the objective function through a barrier penalty and a forward-backward algorithm is exploited to build a minimizing sequence. The key point of our proposed scheme is that the choice of the regularization, barrier and step-size parameters defining the interior point approach is automatically performed by a deep learning strategy. Numerical tests on Poisson corrupted benchmark datasets show that our method can obtain very good performance when compared to a state-of-the-art variational deblurring strategy.
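In generic notation (the symbols here are illustrative, not necessarily the authors' own), a barrier-penalized objective combined with a forward-backward minimizing sequence takes the form

```latex
\min_{x}\; f(x) + \lambda R(x) + B_{\mu}(x),
\qquad
B_{\mu}(x) = -\mu \sum_{i} \ln c_i(x),
\qquad
x_{k+1} = \operatorname{prox}_{\gamma_k \lambda R}\!\big( x_k - \gamma_k \nabla \big[\, f + B_{\mu_k} \big](x_k) \big),
```

where \(f\) is the Poisson (generalized Kullback-Leibler) data fidelity, \(R\) the regularizer, the logarithmic terms form a barrier for constraints \(c_i(x) > 0\), and \(\lambda\), \(\mu_k\), \(\gamma_k\) are the regularization, barrier and step-size parameters that the paper proposes to select with a deep learning strategy.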


Real-time segmentation for tomographic imaging
Richard A Schoonhoven, Jan-Willem Buurlage, Daniël Pelt, Kees Joost Batenburg

In tomography, reconstruction and analysis are often performed only once the acquisition has been completed, due to the computational cost of the 3D imaging algorithms. In contrast, real-time reconstruction and analysis can avoid costly repetition of experiments and enable optimization of experimental parameters. Recently, it was shown that real-time quasi-3D reconstruction can be attained by reconstructing a subset of arbitrarily oriented slices. Here, we extend this approach by including real-time segmentation, thereby enabling real-time analysis during the experiment. We propose to use a convolutional neural network (CNN) to perform real-time image segmentation and introduce an adapted training strategy in order to apply CNNs to arbitrarily oriented slices. We evaluate our method on both simulated and real-world data. The experiments show that our approach enables real-time tomographic segmentation for real-world applications and outperforms standard unsupervised segmentation methods.


Electrical impedance tomography, enclosure method and machine learning
Samuli Siltanen, Takanori Ide

Electrical impedance tomography (EIT) is a non-destructive imaging method, where a physical body is probed with electric measurements at the boundary, and information about the internal conductivity is extracted from the data. The enclosure method of Ikehata [J. Inv. Ill-Posed Prob. 8(2000)] recovers the convex hull of an inclusion of unknown conductivity embedded in known background conductivity. Practical implementations of the enclosure method are based on least-squares (LS) fitting of lines to noise-robust values of the so-called indicator function. It is shown how using a convolutional neural network in place of LS fitting significantly improves the accuracy of the enclosure method while retaining interpretability.


Blind hierarchical deconvolution
Arttu Arjas, Lassi Roininen, Mikko Sillanpää, Andreas Hauptmann

Deconvolution is a fundamental inverse problem in signal processing and the prototypical model for recovering a signal from its noisy measurement. Nevertheless, the majority of model-based inversion techniques require knowledge of the convolution kernel to recover an accurate reconstruction, and additionally require prior assumptions on the regularity of the signal. To overcome these limitations, we parametrise the convolution kernel and the prior length-scales, which are then jointly estimated in the inversion procedure. The proposed framework of blind hierarchical deconvolution enables accurate reconstructions of functions with varying regularity and unknown kernel size, and can be solved efficiently with an empirical Bayes two-step procedure, where hyperparameters are first estimated by optimisation and the remaining unknowns then by an analytical formula.


Anomaly location detection with electrical impedance tomography using multilayer perceptrons
Timo A Huuhtanen, Alexander Jung

Electrical impedance tomography (EIT) performs imaging by solving a nonlinear ill-posed inverse problem. Recently, there has been increasing interest in solving this problem with artificial neural networks. However, a systematic understanding of the optimal neural network architecture for this problem is still lacking. This paper compares the performance of different multilayer perceptron algorithms for detecting the location of an anomaly on a sensing surface by solving the EIT inverse problem. We generate synthetic data with varying anomaly sizes and locations and compare a wide range of multilayer perceptron algorithms by simulations. Our results indicate that increasing the dimensions of the perceptron improves performance, but the improvement quickly saturates. The best performance is achieved when using the multilayer perceptron for regression and Gaussian noise addition as the regularization method.


iUNets: Learnable invertible up- and downsampling for large-scale inverse problems
Christian Etmann, Rihuan Ke, Carola-Bibiane Schönlieb

U-Nets have been established as a standard neural network architecture for image-to-image problems such as segmentation and inverse problems in imaging. For high-dimensional applications, as they for example appear in 3D medical imaging, U-Nets however have prohibitive memory requirements. Here, we present a new fully-invertible U-Net-based architecture called the iUNet, which allows for the application of highly memory-efficient backpropagation procedures. As its main building block, we introduce learnable and invertible up- and downsampling operations. For this, we developed an open-source implementation in PyTorch for 1D, 2D and 3D data.
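To see why an invertible downsampling loses no information, consider the simplest fixed variant, the "squeeze" (pixel-unshuffle) rearrangement that moves each 2x2 spatial block into 4 channels. The iUNet replaces this fixed permutation with *learnable* orthogonal operators; the toy below only illustrates the invertibility idea, not the paper's implementation.

```python
# Fixed, invertible 2x downsampling ("squeeze"): each 2x2 spatial block of a
# grid becomes 4 channel entries. Nothing is discarded, so the operation can
# be undone exactly -- the property iUNets preserve with learnable operators.

def squeeze(x):
    """[H][W] grid -> [4][H/2][W/2] channels (H, W even)."""
    h, w = len(x), len(x[0])
    return [[[x[2 * i + di][2 * j + dj] for j in range(w // 2)]
             for i in range(h // 2)]
            for di in range(2) for dj in range(2)]

def unsqueeze(c):
    """Inverse of squeeze: [4][h][w] channels -> [2h][2w] grid."""
    h, w = len(c[0]), len(c[0][0])
    x = [[0] * (2 * w) for _ in range(2 * h)]
    for k, (di, dj) in enumerate([(0, 0), (0, 1), (1, 0), (1, 1)]):
        for i in range(h):
            for j in range(w):
                x[2 * i + di][2 * j + dj] = c[k][i][j]
    return x

x = [[r * 4 + c for c in range(4)] for r in range(4)]
assert unsqueeze(squeeze(x)) == x  # the round trip is lossless
```

A strided convolution, by contrast, maps many inputs to the same output and cannot be inverted, which is what forces conventional U-Nets to store activations for backpropagation.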

Lecture Session 5: Learning Theory, Modeling, and Graphical Models

Chair: Alexander Ilin, Aalto University


Deep blind one-bit signal recovery
Rui Da Huang, Xu Chen

In this paper, we consider blind one-bit signal recovery, where the linear measurements of the signal are quantized to single bits and the signal is to be recovered without knowledge of the measurement matrix, noise statistics and quantization thresholds. We propose two deep learning based methods, one based on a multi-layer perceptron architecture and the other based on the long short-term memory architecture. The two neural architectures are inspired by the recurrent calculation of the gradient descent method to solve the maximum likelihood detection. The performance of the proposed schemes is compared with a recently proposed deep learning based signal recovery framework. Experiments show that our proposed blind one-bit signal recovery schemes achieve comparable signal reconstruction performance at much lower complexity, even without knowledge of the measurement matrix.
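The measurement model in this setting can be sketched as follows (the notation is mine, not the paper's): each observation is the sign of a noisy linear projection compared to a threshold, and "blind" means the recovery network never sees the matrix, the noise level, or the thresholds.

```python
import random

# Toy one-bit measurement model: y_i = sign(<a_i, x> + n_i - tau_i).
# A blind recovery method must invert this map from y alone.

def one_bit_measure(A, x, taus, sigma, rng):
    y = []
    for a_row, tau in zip(A, taus):
        z = sum(a * xi for a, xi in zip(a_row, x))  # linear projection
        z += rng.gauss(0.0, sigma)                  # measurement noise
        y.append(1 if z >= tau else -1)             # one-bit quantizer
    return y

rng = random.Random(0)
n, m = 8, 32                                        # signal dim, measurements
x = [rng.gauss(0, 1) for _ in range(n)]
A = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]
y = one_bit_measure(A, x, [0.0] * m, 0.1, rng)
assert set(y) <= {-1, 1}
```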


SCAD-penalized complex Gaussian graphical model selection
Jitendra K Tugnait

We consider the problem of estimating the conditional independence graph (CIG) of a sparse, high-dimensional proper complex-valued Gaussian graphical model (CGGM). For CGGMs, the problem reduces to estimation of the inverse covariance matrix with more unknowns than the sample size. We consider a smoothly clipped absolute deviation (SCAD) penalty instead of the \(\ell_1\)-penalty to regularize the problem, and analyze a SCAD-penalized log-likelihood based objective function to establish consistency and sparsistency of a local estimator of the inverse covariance in a neighborhood of the true value. A numerical example is presented to illustrate the advantage of the SCAD penalty over the usual \(\ell_1\)-penalty.
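For reference, the SCAD penalty of Fan and Li behaves like the \(\ell_1\)-penalty near zero but flattens to a constant for large entries, so large inverse-covariance entries are not over-shrunk. A minimal sketch (with the conventional default a = 3.7):

```python
# SCAD penalty: linear near zero, quadratic taper, constant beyond a*lam.

def scad(t, lam=1.0, a=3.7):
    at = abs(t)
    if at <= lam:                         # l1-like region
        return lam * at
    if at <= a * lam:                     # quadratic transition
        return (2 * a * lam * at - at**2 - lam**2) / (2 * (a - 1))
    return lam**2 * (a + 1) / 2           # constant: no extra shrinkage

assert scad(0.5) == 0.5                   # matches lam*|t| near zero
assert scad(100.0) == scad(10.0)          # flat beyond a*lam, unlike l1
```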


On the adversarial robustness of feature selection using LASSO
Fuwei Li, Lifeng Lai, Shuguang (Robert) Cui

In this paper, we investigate the adversarial robustness of feature selection based on the \(\ell_1\) regularized linear regression method, named LASSO. In the considered problem, there is an adversary who can observe the whole data set. After seeing the data, the adversary will carefully modify the response values and the feature matrix in order to manipulate the selected features. We formulate this problem as a bi-level optimization problem and cast the \(\ell_1\) regularized linear regression problem as a linear inequality constrained quadratic programming problem to mitigate the issue caused by the non-differentiability of the \(\ell_1\) norm. We then use projected gradient descent to design the modification strategy. Numerical examples based on synthetic data and real data both indicate that the feature selection is very vulnerable to this kind of attack.
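The projection step that keeps the adversary's modification within its budget can be sketched as below. This is an illustration of the generic projected-gradient technique under an \(\ell_2\) budget; the paper's actual bi-level objective and gradients are more involved.

```python
import math

# One projected-gradient-style step: move against the gradient, then
# project back onto an l2 ball around the original (clean) data.

def project_l2_ball(v, center, radius):
    """Project v onto the ball {u : ||u - center||_2 <= radius}."""
    d = [vi - ci for vi, ci in zip(v, center)]
    norm = math.sqrt(sum(di * di for di in d))
    if norm <= radius:
        return list(v)
    scale = radius / norm
    return [ci + di * scale for ci, di in zip(center, d)]

def pgd_step(v, grad, step, center, radius):
    moved = [vi - step * gi for vi, gi in zip(v, grad)]
    return project_l2_ball(moved, center, radius)

y = [1.0, 2.0, 3.0]                              # clean responses
y_adv = pgd_step(y, [0.0, -10.0, 0.0], 0.1, y, 0.5)
assert abs(math.dist(y_adv, y) - 0.5) < 1e-9     # projection makes budget tight
```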


On the minimization of Sobolev norms of time-varying graph signals: Estimation of new coronavirus disease 2019 cases
Jhony H. Giraldo, Thierry Bouwmans

The mathematical modeling of infectious diseases is a fundamental research field for the planning of strategies to contain outbreaks. The models in this field usually place exponential prior assumptions on the number of new cases, while spatial data have been little explored in these models. In this paper, we model the number of new cases of the Coronavirus Disease 2019 (COVID-19) as a problem of reconstruction of time-varying graph signals. To this end, we propose a new method based on the minimization of the Sobolev norm in graph signal processing. Our method outperforms state-of-the-art algorithms on two COVID-19 databases provided by Johns Hopkins University. We also demonstrate the favorable convergence rate of the Sobolev reconstruction method by analyzing the condition number of the Hessian associated with the underlying optimization problem.
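For reference, one standard form of a Sobolev norm on graph signals (the notation here is mine; the paper's time-varying formulation extends this with temporal differences) is

```latex
\|x\|_{S(\beta,\epsilon)}^{2}
\;=\; \big\| (L + \epsilon I)^{\beta/2}\, x \big\|_2^2
\;=\; x^{\mathsf T} (L + \epsilon I)^{\beta}\, x,
```

where \(L\) is the graph Laplacian, \(\epsilon \ge 0\) and \(\beta > 0\); minimizing this norm promotes reconstructions that vary smoothly over the graph topology.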


Fixing bias in reconstruction-based anomaly detection with Lipschitz discriminators
Alexander Y Tong, Guy Wolf, Smita Krishnaswamy

Anomaly detection is of great interest in fields where abnormalities need to be identified and corrected (e.g., medicine and finance). Deep learning methods for this task often rely on autoencoder reconstruction error, sometimes in conjunction with other errors. We show that this approach exhibits intrinsic biases that lead to undesirable results: reconstruction-based methods are sensitive to training-data outliers and to simple-to-reconstruct points. Instead, we introduce a new unsupervised Lipschitz anomaly discriminator that does not suffer from these biases. Our anomaly discriminator is trained, similarly to the discriminators used in GANs, to detect the difference between the training data and corruptions of the training data. We show that this procedure successfully detects unseen anomalies, with guarantees for anomalies that lie at a certain Wasserstein distance from the training data or its corruptions. These additions allow us to show improved performance on MNIST, CIFAR10, and health record data.


First-order optimization for superquantile-based supervised learning
Yassine Laguel, Jérôme Malick, Zaid Harchaoui

Classical supervised learning via empirical risk (or negative log-likelihood) minimization hinges upon the assumption that the testing distribution coincides with the training distribution. This assumption can be challenged in modern applications of machine learning in which learning machines may operate at prediction time on testing data whose distribution departs from that of the training data. We revisit the superquantile approach proposed by Rockafellar and present a first-order optimization algorithm based on smoothing by infimal convolution to minimize a superquantile-based objective for safer supervised learning. Promising numerical results illustrate the promise of the approach.
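The superquantile (also known as CVaR or expected shortfall) at level p is the average of the worst (1 - p) fraction of losses; minimizing it instead of the mean loss hedges against distribution shift. A simple empirical version (exact CVaR interpolates when (1 - p)n is not an integer; this sketch just truncates):

```python
import math

# Empirical superquantile: average of the worst ceil((1 - p) * n) losses.

def superquantile(losses, p):
    k = max(1, math.ceil((1 - p) * len(losses)))   # size of the tail
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / len(worst)

losses = [1.0, 2.0, 3.0, 4.0]
assert superquantile(losses, 0.5) == 3.5   # mean of the two worst losses
assert superquantile(losses, 0.0) == 2.5   # p = 0 recovers the plain mean
```

As p grows, the objective interpolates between the average loss (p = 0) and the worst-case loss (p near 1), which is the safety knob the paper exploits.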

Poster Session 3: Learning Theory and Bayesian Modeling

Chair: Manon Kok, Delft University of Technology


PPO-CMA: Proximal policy optimization with covariance matrix adaptation
Perttu Hämäläinen, Amin Babadi, Xiaoxiao Ma, Jaakko Lehtinen

Proximal Policy Optimization (PPO) is a highly popular model-free reinforcement learning (RL) approach. However, we observe that in a continuous action space, PPO can prematurely shrink the exploration variance, which leads to slow progress and may make the algorithm prone to getting stuck in local optima. Drawing inspiration from CMA-ES, a black-box evolutionary optimization method designed for robustness in similar situations, we propose PPO-CMA, a proximal policy optimization approach that adaptively expands the exploration variance to speed up progress. With only minor changes to PPO, our algorithm considerably improves performance in Roboschool continuous control benchmarks. Our results also show that PPO-CMA, as opposed to PPO, is significantly less sensitive to the choice of hyperparameters, allowing one to use it in complex movement optimization tasks without requiring tedious tuning.


Approximate Gaussian process regression and performance analysis using composite likelihood
Xiuming Liu, Dave Zachariah, Edith Ngai

Nonparametric regression using Gaussian Process (GP) models is a powerful but computationally demanding method. While various approximation methods have been developed to mitigate its computational complexity, few works have addressed the quality of the resulting approximations of the target posterior. In this paper we start from a general belief updating framework that can generate various approximations. We show that using composite likelihoods yields computationally scalable approximations for both GP learning and prediction. We then analyze the quality of the approximation in terms of averaged prediction errors as well as Kullback-Leibler (KL) divergences.


Sequential heterogeneous feature selection for multi-class classification: Application in Government 2.0
Imara Mohamed Nazar, Yasitha Warahena Liyanage, Daphney-Stavroula Zois, Charalampos Chelmis

Herein, the problem of multi-class classification of participatory civil issue reports in crowdsourcing platforms is addressed. Specifically, an efficient method is proposed to guide the selection of heterogeneous features, so as to account for different information facets of the reported issue. An optimization framework is devised to select the minimum number of informative features from each feature set, and switch between feature sets when deemed necessary to achieve an accurate classification decision. Evaluation on real-world data from SeeClickFix, a government 2.0 platform, shows the ability of the proposed framework to classify civil issue reports with up to 92.6% accuracy, while using 99.82% fewer features than the state-of-the-art.


On the adversarial robustness of linear regression
Fuwei Li, Lifeng Lai, Shuguang (Robert) Cui

In this paper, we study the adversarial robustness of linear regression problems. Specifically, we investigate the robustness of the regression coefficients against adversarial data samples. In the considered model, there exists an adversary who is able to add one carefully designed adversarial data sample into the dataset. By leveraging this poisoned data sample, the adversary tries to boost or depress the magnitude of one targeted regression coefficient under the energy constraint of the adversarial data sample. We characterize the exact expression of the optimal adversarial data sample in terms of the targeted regression coefficient, the original dataset and the energy budget. Our experiments with synthetic and real datasets show the efficiency and optimality of our proposed adversarial strategy.


Non-linearities in Gaussian processes with integral observations
Ville Tanskanen, Krista Longi, Arto Klami

Gaussian processes (GP) can be used for inferring latent continuous functions also based on aggregate observations corresponding to integrals of the function, for example to learn daily rate of new infections in a population based on cumulative observations collected only weekly. We extend these approaches to cases where the observations correspond to aggregates of arbitrary non-linear transformations of a GP. Such models are needed, for example, when the latent function of interest is known to be non-negative or bounded. We present a solution based on Markov chain Monte Carlo with numerical integration for aggregation, and demonstrate it in binned Poisson regression and in non-invasive detection of fouling using ultrasound waves.
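A sketch of the observation model described above, in illustrative notation (the non-linearity \(g\), the bins \(A_j\) and the quadrature weights are placeholders, not the paper's symbols):

```latex
y_j \;=\; \int_{A_j} g\big(f(t)\big)\,\mathrm{d}t + \varepsilon_j
\;\approx\; \sum_{s=1}^{S} w_s\, g\big(f(t_{j,s})\big) + \varepsilon_j,
\qquad f \sim \mathcal{GP}(m, k),
```

where \(g\) is a known non-linear transformation (e.g. the exponential, to enforce positivity of the latent rate) and the sum is the numerical quadrature used for aggregation inside the Markov chain Monte Carlo procedure.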


Exact O(n²) hyper-parameter optimization for Gaussian process regression
Linning Xu, Yijue Dai, Jiawei Zhang, Ceyao Zhang, Feng Yin

Hyper-parameter optimization remains a core issue in Gaussian process (GP) based machine learning. The benchmark method using maximum likelihood (ML) estimation and gradient descent (GD) is impractical for processing big data due to its O(n³) complexity. Many sophisticated global or local approximation models have been proposed to address this complexity issue. In this paper, we propose two novel and exact GP hyper-parameter training schemes by replacing ML with cross-validation (CV) as the fitting criterion and replacing GD with a non-linearly constrained alternating direction method of multipliers (ADMM) as the optimization method. The proposed schemes are of O(n²) complexity for any covariance matrix without special structure. We conduct experiments based on synthetic and real datasets, wherein the proposed schemes show excellent performance in terms of convergence, hyper-parameter estimation, and computational time in comparison with the traditional ML based routines.


Diffusion field estimation using decentralized kernel Kalman filter with parameter learning over hierarchical sensor networks
Shengdi Wang, Ban-Sok Shin, Dmitriy Shutin, Armin Dekorsy

In this paper, we address the task of tracking a nonlinear time-varying diffusion field based on data collected by a sensor network. By exploiting kernel methods, the nonlinear field function is approximated by a linear combination of kernel functions in a reproducing kernel Hilbert space (RKHS). To capture the dynamical properties of the diffusion field and the relation between system input and output data, a state-space model on the weights of these kernel functions is constructed with unknown process noise. The nonlinear tracking problem is thus transformed into a linear state estimation problem solved by a Kalman filter. Further, this kernel Kalman filter (KKF) is decomposed in a decentralized fashion so as to collect sensor data efficiently over a hierarchical network structure with different clusters. To adapt the algorithm to unknown process noise, a decentralized variational Bayesian KKF is proposed to learn the distributions of the unknown system variables.


Fast variational learning in state-space Gaussian process models
Paul Chang, William Wilkinson, Mohammad Emtiyaz Khan, Arno Solin

Gaussian process (GP) regression with 1D inputs can often be performed in linear time via a stochastic differential equation formulation. However, for non-Gaussian likelihoods, this requires application of approximate inference methods which can make the implementation difficult, e.g., expectation propagation can be numerically unstable and variational inference can be computationally inefficient. In this paper, we propose a new method that removes such difficulties. Building upon an existing method called conjugate-computation variational inference, our approach enables linear-time inference via Kalman recursions while avoiding numerical instabilities and convergence issues. We provide an efficient JAX implementation which exploits just-in-time compilation and allows for fast automatic differentiation through large for-loops. Overall, our approach leads to fast and stable variational inference in state-space GP models that can be scaled to time series with millions of data points.
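The linear-time building block behind state-space GP inference is the Kalman recursion; conjugate-computation variational inference reuses exactly this machinery with iteratively refitted Gaussian "pseudo" likelihoods for non-Gaussian observation models. A minimal scalar sketch (generic textbook recursion, not the paper's JAX implementation):

```python
# Scalar Kalman filter for x_k = a*x_{k-1} + N(0, q), y_k = h*x_k + N(0, r).
# Each observation costs O(1), so n observations cost O(n) overall.

def kalman_filter(ys, a, q, h, r, m0=0.0, p0=1.0):
    """Return the filtered means for the observation sequence ys."""
    m, p = m0, p0
    means = []
    for y in ys:
        m, p = a * m, a * a * p + q                  # predict
        s = h * h * p + r                            # innovation variance
        k = p * h / s                                # Kalman gain
        m, p = m + k * (y - h * m), (1 - k * h) * p  # update
        means.append(m)
    return means

ms = kalman_filter([1.0, 1.2, 0.9, 1.1], a=1.0, q=0.01, h=1.0, r=0.1)
assert all(abs(m - 1.0) < 0.5 for m in ms)  # estimates track the noisy level
```

For a GP with a Markovian kernel (e.g. Matérn), a, q, h, r are derived from the kernel's stochastic differential equation representation, which is what reduces 1D GP regression to this recursion.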


Collision-free UAV navigation with a monocular camera using deep reinforcement learning
Yun Chen, Nuria Gonzalez-Prelcic, Robert W Heath

Small unmanned aerial vehicles (UAV) with reduced sensing and communication capabilities can support potential use cases in different indoor environments such as automated factories or commercial buildings. In this context, we consider the problem of collision-free autonomous UAV navigation supported by a simple sensor. We propose a navigation system based on object detection and deep reinforcement learning (DRL) that only exploits sensing data obtained by a monocular camera mounted on the UAV. Object detection is incorporated into DRL training to reduce flight time and to maximize the probability of avoiding both current and future crashes. Moreover, object detection also helps to remove the impact of wrong predictions from the deep network. When compared to schemes using traditional RL methods, the proposed framework not only leads to collision-free trips, but also reduces flying times towards given destinations by 25% and cuts unnecessary turns by 50%.


Aerial spectrum surveying: Radio map estimation with autonomous UAVs
Daniel Romero, Raju Shrestha, Yves Teganya, Sundeep Chepuri

Radio maps are emerging as a popular means to endow next-generation wireless communications with situational awareness. In particular, radio maps are expected to play a central role in unmanned aerial vehicle (UAV) communications since they can be used to determine interference or channel gain at a spatial location where a UAV has not been before. Existing methods for radio map estimation utilize measurements collected by sensors whose locations cannot be controlled. In contrast, this paper proposes a scheme in which a UAV collects measurements along a trajectory. This trajectory is designed to obtain accurate estimates of the target radio map within a short operation time. The route planning algorithm relies on a map uncertainty metric to collect measurements at those locations where they are most informative. An online Bayesian learning algorithm is developed to update the map estimate and uncertainty metric every time a new measurement is collected, which enables real-time operation.