Abstract
The total variation (TV) penalty, like many other analysis-sparsity penalties, leads neither to separable factors nor to a proximal operator with a closed-form expression, such as soft thresholding for the ℓ1 penalty. As a result, in a variational formulation of an inverse problem or of a statistical learning estimation problem, it leads to challenging non-smooth optimization problems that are often solved with elaborate single-step first-order methods. When the data-fit term arises from empirical measurements, as in brain imaging, it is often very ill-conditioned and lacks simple structure. In this situation, the computational cost of the gradient step can easily dominate each iteration of a proximal splitting method, so it is beneficial to minimize the number of gradient steps. We present fAASTA, a variant of FISTA that relies on an internal solver for the TV proximal operator and refines the tolerance of that solver to balance the computational cost of the gradient and proximal steps. We give benchmarks and illustrations on “brain decoding”: recovering brain maps from noisy measurements to predict observed behavior. The algorithm and the empirical study of its convergence speed are relevant to any problem with a non-exact proximal operator, in particular analysis-sparsity problems.
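To make the contrast between the two penalties concrete, the following minimal sketch shows soft thresholding, the closed-form proximal operator of the ℓ1 penalty, next to a one-dimensional TV penalty. The function names `soft_threshold` and `tv_1d` are illustrative assumptions and do not come from the paper, and the 1D anisotropic TV used here is a simplification of the image-domain penalty discussed above.

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1.

    The l1 penalty is separable, so each coordinate is shrunk
    toward zero independently of the others.
    """
    out = []
    for xi in x:
        if xi > lam:
            out.append(xi - lam)
        elif xi < -lam:
            out.append(xi + lam)
        else:
            out.append(0.0)
    return out


def tv_1d(x):
    """1D (anisotropic) total variation: sum of absolute differences.

    Illustrative only: each term couples two neighbouring coefficients,
    so the penalty does not split into independent per-coordinate terms.
    """
    return sum(abs(b - a) for a, b in zip(x, x[1:]))


x = [3.0, -0.5, 1.5, -2.0]
print(soft_threshold(x, 1.0))  # each entry handled on its own
print(tv_1d(x))                # depends on pairs of neighbours
```

Because `tv_1d` ties every coefficient to its neighbours, its proximal operator cannot be computed coordinate by coordinate in the way `soft_threshold` is, which is why an iterative inner solver is needed.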