argmin/lib.rs
// Copyright 2018-2020 argmin developers
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://apache.org/licenses/LICENSE-2.0> or the MIT license <LICENSE-MIT or
// http://opensource.org/licenses/MIT>, at your option. This file may not be
// copied, modified, or distributed except according to those terms.

//! A pure Rust optimization framework
//!
//! This crate offers a numerical optimization toolbox/framework written entirely in Rust. It is a
//! work in progress and may still be buggy, so please use it with care and report any bugs you
//! encounter. This crate is looking for contributors!
//!
//! [Documentation of most recent release](https://docs.rs/argmin/latest/argmin/)
//!
//! [Documentation of master](https://argmin-rs.github.io/argmin/argmin/)
//!
//! # Design goals
//!
//! This crate's intention is to be useful to users as well as developers of optimization
//! algorithms, meaning that it should be easy to apply existing algorithms as well as easy to
//! implement new ones. In particular, as a developer of optimization algorithms you should not
//! need to worry about usability features (such as logging, dealing with different types, setters
//! and getters for certain common parameters, counting cost function and gradient evaluations,
//! termination, and so on). Instead you can focus on implementing your algorithm.
//!
//! - Easy framework for the implementation of optimization algorithms: Implement a single iteration
//!   of your method and let the framework do the rest. This leads to similar interfaces for
//!   different solvers, making it easy for users.
//! - Pure Rust implementations of a wide range of optimization methods: This avoids the need to
//!   compile and interface C/C++/Fortran code.
//! - Type-agnostic: Many problems require data structures that go beyond simple vectors to
//!   represent the parameters. In argmin, everything is generic: All that needs to be done is
//!   implementing certain traits on your data type. For common types, these traits are already
//!   implemented.
//! - Convenient: Easy and consistent logging of anything that may be important. Log to the
//!   terminal, to a file or implement your own observers. Future plans include sending metrics to
//!   databases and connecting to big data pipelines.
//! - Algorithm evaluation: Methods to assess the performance of an algorithm for different
//!   parameter settings, problem classes, ...
//!
//! Since this crate is still in an early stage, most of these points are so far only partially
//! implemented or remain future plans.
//!
//! # Algorithms
//!
//! - [Line searches](solver/linesearch/index.html)
//!   - [Backtracking line search](solver/linesearch/backtracking/struct.BacktrackingLineSearch.html)
//!   - [More-Thuente line search](solver/linesearch/morethuente/struct.MoreThuenteLineSearch.html)
//!   - [Hager-Zhang line search](solver/linesearch/hagerzhang/struct.HagerZhangLineSearch.html)
//! - [Trust region method](solver/trustregion/trustregion_method/struct.TrustRegion.html)
//!   - [Cauchy point method](solver/trustregion/cauchypoint/struct.CauchyPoint.html)
//!   - [Dogleg method](solver/trustregion/dogleg/struct.Dogleg.html)
//!   - [Steihaug method](solver/trustregion/steihaug/struct.Steihaug.html)
//! - [Steepest descent](solver/gradientdescent/steepestdescent/struct.SteepestDescent.html)
//! - [Conjugate gradient method](solver/conjugategradient/cg/struct.ConjugateGradient.html)
//! - [Nonlinear conjugate gradient method](solver/conjugategradient/nonlinear_cg/struct.NonlinearConjugateGradient.html)
//! - [Newton methods](solver/newton/index.html)
//!   - [Newton's method](solver/newton/newton_method/struct.Newton.html)
//!   - [Newton-CG](solver/newton/newton_cg/struct.NewtonCG.html)
//! - [Quasi-Newton methods](solver/quasinewton/index.html)
//!   - [BFGS](solver/quasinewton/bfgs/struct.BFGS.html)
//!   - [L-BFGS](solver/quasinewton/lbfgs/struct.LBFGS.html)
//!   - [DFP](solver/quasinewton/dfp/struct.DFP.html)
//!   - [SR1](solver/quasinewton/sr1/struct.SR1.html)
//!   - [SR1-TrustRegion](solver/quasinewton/sr1_trustregion/struct.SR1TrustRegion.html)
//! - [Gauss-Newton method](solver/gaussnewton/gaussnewton/struct.GaussNewton.html)
//! - [Gauss-Newton method with linesearch](solver/gaussnewton/gaussnewton_linesearch/struct.GaussNewtonLS.html)
//! - [Golden-section search](solver/goldensectionsearch/struct.GoldenSectionSearch.html)
//! - [Landweber iteration](solver/landweber/struct.Landweber.html)
//! - [Brent's method](solver/brent/struct.Brent.html)
//! - [Nelder-Mead method](solver/neldermead/struct.NelderMead.html)
//! - [Simulated Annealing](solver/simulatedannealing/struct.SimulatedAnnealing.html)
//! - [Particle Swarm Optimization](solver/particleswarm/struct.ParticleSwarm.html)
//!
//! # Usage
//!
//! Add this to your `Cargo.toml`:
//!
//! ```toml
//! [dependencies]
//! argmin = "0.3.1"
//! ```
//!
//! ## Optional features (recommended)
//!
//! There are additional features which can be activated in `Cargo.toml`:
//!
//! ```toml
//! [dependencies]
//! argmin = { version = "0.3.1", features = ["ctrlc", "ndarrayl"] }
//! ```
//!
//! These may become default features in the future. Without these features, compilation to
//! `wasm32-unknown-unknown` seems to be possible.
//!
//! - `ctrlc`: Uses the `ctrlc` crate to properly stop the optimization (and return the current best
//!   result) after pressing Ctrl+C.
//! - `ndarrayl`: Support for `ndarray`, `ndarray-linalg` and `ndarray-rand`.
//!
//! ## Running the tests
//!
//! Running the tests requires the `ndarrayl` feature to be enabled:
//!
//! ```bash
//! cargo test --features "ndarrayl"
//! ```
//!
//! # Defining a problem
//!
//! A problem can be defined by implementing the `ArgminOp` trait, which comes with the
//! associated types `Param`, `Output`, `Hessian`, `Jacobian` and `Float`. `Param` is the type of
//! your parameter vector (i.e. the input to your cost function), `Output` is the type returned
//! by the cost function, `Hessian` is the type of the Hessian, `Jacobian` is the type of the
//! Jacobian and `Float` is the floating point precision used throughout.
//! The trait provides the following methods:
//!
//! - `apply(&self, p: &Self::Param) -> Result<Self::Output, Error>`: Applies the cost
//!   function to parameters `p` of type `Self::Param` and returns the cost function value.
//! - `gradient(&self, p: &Self::Param) -> Result<Self::Param, Error>`: Computes the
//!   gradient at `p`.
//! - `hessian(&self, p: &Self::Param) -> Result<Self::Hessian, Error>`: Computes the Hessian
//!   at `p`.
//! - `jacobian(&self, p: &Self::Param) -> Result<Self::Jacobian, Error>`: Computes the Jacobian
//!   at `p`.
//!
//! The following code snippet shows an example of how to use the Rosenbrock test functions from
//! `argmin-testfunctions` in argmin:
//! ```rust
//! # extern crate argmin;
//! # extern crate argmin_testfunctions;
//! # extern crate ndarray;
//! use argmin_testfunctions::{rosenbrock_2d, rosenbrock_2d_derivative, rosenbrock_2d_hessian};
//! use argmin::prelude::*;
//!
//! /// First, create a struct for your problem
//! struct Rosenbrock {
//!     a: f64,
//!     b: f64,
//! }
//!
//! /// Implement `ArgminOp` for `Rosenbrock`
//! impl ArgminOp for Rosenbrock {
//!     /// Type of the parameter vector
//!     type Param = Vec<f64>;
//!     /// Type of the return value computed by the cost function
//!     type Output = f64;
//!     /// Type of the Hessian. Can be `()` if not needed.
//!     type Hessian = Vec<Vec<f64>>;
//!     /// Type of the Jacobian. Can be `()` if not needed.
//!     type Jacobian = ();
//!     /// Floating point precision
//!     type Float = f64;
//!
//!     /// Apply the cost function to a parameter `p`
//!     fn apply(&self, p: &Self::Param) -> Result<Self::Output, Error> {
//!         Ok(rosenbrock_2d(p, self.a, self.b))
//!     }
//!
//!     /// Compute the gradient at parameter `p`.
//!     fn gradient(&self, p: &Self::Param) -> Result<Self::Param, Error> {
//!         Ok(rosenbrock_2d_derivative(p, self.a, self.b))
//!     }
//!
//!     /// Compute the Hessian at parameter `p`.
//!     fn hessian(&self, p: &Self::Param) -> Result<Self::Hessian, Error> {
//!         let t = rosenbrock_2d_hessian(p, self.a, self.b);
//!         Ok(vec![vec![t[0], t[1]], vec![t[2], t[3]]])
//!     }
//! }
//! ```
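//!
//! For reference, and not part of the argmin API itself: in its usual parametrization, the
//! two-dimensional Rosenbrock function evaluated above is
//! `f(x_1, x_2) = (a - x_1)^2 + b * (x_2 - x_1^2)^2`, with gradient
//! `[-2 * (a - x_1) - 4 * b * x_1 * (x_2 - x_1^2), 2 * b * (x_2 - x_1^2)]`, which is what `apply`
//! and `gradient` return here (assuming `argmin-testfunctions` uses this standard parametrization).
//! Its global minimum `f = 0` is located at `(a, a^2)`.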
//!
//! It is optional to implement any of these methods, as there are default implementations which
//! return an `Err` when called. Which methods actually need to be implemented is determined by the
//! requirements of the solver that is to be used: for instance, a gradient-based solver such as
//! `SteepestDescent` requires `apply` and `gradient`, while Newton-type methods additionally
//! require `hessian`.
//!
//! # Running a solver
//!
//! The following example shows how to use the previously shown definition of a problem in a
//! Steepest Descent (Gradient Descent) solver.
//!
//! ```rust
//! # #![allow(unused_imports)]
//! # extern crate argmin;
//! # extern crate argmin_testfunctions;
//! use argmin::prelude::*;
//! use argmin::solver::gradientdescent::SteepestDescent;
//! use argmin::solver::linesearch::MoreThuenteLineSearch;
//! # use argmin_testfunctions::{rosenbrock_2d, rosenbrock_2d_derivative};
//! #
//! # struct Rosenbrock {
//! #     a: f64,
//! #     b: f64,
//! # }
//! #
//! # impl ArgminOp for Rosenbrock {
//! #     type Param = Vec<f64>;
//! #     type Output = f64;
//! #     type Hessian = ();
//! #     type Jacobian = ();
//! #     type Float = f64;
//! #
//! #     fn apply(&self, p: &Self::Param) -> Result<Self::Output, Error> {
//! #         Ok(rosenbrock_2d(p, self.a, self.b))
//! #     }
//! #
//! #     fn gradient(&self, p: &Self::Param) -> Result<Self::Param, Error> {
//! #         Ok(rosenbrock_2d_derivative(p, self.a, self.b))
//! #     }
//! # }
//! #
//! # fn run() -> Result<(), Error> {
//!
//! // Define cost function (must implement `ArgminOp`)
//! let cost = Rosenbrock { a: 1.0, b: 100.0 };
//!
//! // Define initial parameter vector
//! let init_param: Vec<f64> = vec![-1.2, 1.0];
//!
//! // Set up line search
//! let linesearch = MoreThuenteLineSearch::new();
//!
//! // Set up solver
//! let solver = SteepestDescent::new(linesearch);
//!
//! // Run solver
//! let res = Executor::new(cost, solver, init_param)
//!     // Add an observer which will log all iterations to the terminal
//!     .add_observer(ArgminSlogLogger::term(), ObserverMode::Always)
//!     // Set maximum iterations to 10
//!     .max_iters(10)
//!     // Run the solver on the defined problem
//!     .run()?;
//! #
//! # // Wait a second (lets the logger flush everything first)
//! # std::thread::sleep(std::time::Duration::from_secs(1));
//!
//! // Print result
//! println!("{}", res);
//! # Ok(())
//! # }
//! #
//! # fn main() {
//! #     if let Err(ref e) = run() {
//! #         println!("{}", e);
//! #         std::process::exit(1);
//! #     }
//! # }
//! ```
//!
//! # Observing iterations
//!
//! Argmin offers an interface to observe the state of the solver at initialization as well as
//! after every iteration. This includes the parameter vector, gradient, Hessian, iteration number,
//! cost values and many more, as well as solver-specific metrics. This interface can be used to
//! implement loggers, send the information to storage or plot metrics.
//! Observers need to implement the `Observe` trait.
//! Argmin ships with a logger based on the `slog` crate. `ArgminSlogLogger::term` logs to the
//! terminal and `ArgminSlogLogger::file` logs to a file in JSON format. Both loggers also come
//! with a `*_noblock` version which logs without blocking the execution, but may drop some
//! messages if the buffer is full.
//! Parameter vectors can be written to disk using `WriteToFile`.
//! For each observer it can be defined how often it observes the progress of the solver. This
//! is indicated via the enum `ObserverMode`, which can be either `Always`, `Never`, `NewBest`
//! (whenever a new best solution is found) or `Every(i)`, which means every `i`th iteration.
//!
//! ```rust
//! # #![allow(unused_imports)]
//! # extern crate argmin;
//! # extern crate argmin_testfunctions;
//! # use argmin::prelude::*;
//! # use argmin::solver::gradientdescent::SteepestDescent;
//! # use argmin::solver::linesearch::MoreThuenteLineSearch;
//! # use argmin_testfunctions::{rosenbrock_2d, rosenbrock_2d_derivative};
//! #
//! # struct Rosenbrock {
//! #     a: f64,
//! #     b: f64,
//! # }
//! #
//! # impl ArgminOp for Rosenbrock {
//! #     type Param = Vec<f64>;
//! #     type Output = f64;
//! #     type Hessian = ();
//! #     type Jacobian = ();
//! #     type Float = f64;
//! #
//! #     fn apply(&self, p: &Self::Param) -> Result<Self::Output, Error> {
//! #         Ok(rosenbrock_2d(p, self.a, self.b))
//! #     }
//! #
//! #     fn gradient(&self, p: &Self::Param) -> Result<Self::Param, Error> {
//! #         Ok(rosenbrock_2d_derivative(p, self.a, self.b))
//! #     }
//! # }
//! #
//! # fn run() -> Result<(), Error> {
//! #
//! # // Define cost function (must implement `ArgminOp`)
//! # let problem = Rosenbrock { a: 1.0, b: 100.0 };
//! #
//! # // Define initial parameter vector
//! # let init_param: Vec<f64> = vec![-1.2, 1.0];
//! #
//! # // Set up line search
//! # let linesearch = MoreThuenteLineSearch::new();
//! #
//! # // Set up solver
//! # let solver = SteepestDescent::new(linesearch);
//! #
//! let res = Executor::new(problem, solver, init_param)
//!     // Add an observer which will log all iterations to the terminal (without blocking)
//!     .add_observer(ArgminSlogLogger::term_noblock(), ObserverMode::Always)
//!     // Log to file whenever a new best solution is found
//!     .add_observer(ArgminSlogLogger::file("solver.log", false)?, ObserverMode::NewBest)
//!     // Write parameter vector to `params/param.arg` every 20th iteration
//!     .add_observer(WriteToFile::new("params", "param"), ObserverMode::Every(20))
//! #     .max_iters(2)
//!     // Run the solver on the defined problem
//!     .run()?;
//! # Ok(())
//! # }
//! #
//! # fn main() {
//! #     if let Err(ref e) = run() {
//! #         println!("{}", e);
//! #         std::process::exit(1);
//! #     }
//! # }
//! ```
//!
//! # Checkpoints
//!
//! The probability of crashes increases with runtime, therefore one may want to save checkpoints
//! in order to be able to resume the optimization after a crash.
//! The `CheckpointMode` defines how often checkpoints are saved and is either `Never` (default),
//! `Always` (every iteration) or `Every(u64)` (every Nth iteration). It is set via the setter
//! method `checkpoint_mode` of `Executor`.
//! In addition, the directory where the checkpoints are saved and a prefix for every checkpoint
//! file can be set via `checkpoint_dir` and `checkpoint_name`, respectively.
//!
//! The following example shows how the `from_checkpoint` method can be used to resume from a
//! checkpoint. In case this fails (for instance because the file does not exist, which could mean
//! that this is the first run and there is nothing to resume from), it will fall back to creating
//! a new `Executor`, thus starting from scratch.
//!
//! ```rust
//! # extern crate argmin;
//! # extern crate argmin_testfunctions;
//! # use argmin::prelude::*;
//! # use argmin::solver::landweber::*;
//! # use argmin_testfunctions::{rosenbrock_2d, rosenbrock_2d_derivative};
//! # use argmin::core::Error;
//! #
//! # #[derive(Default)]
//! # struct Rosenbrock {}
//! #
//! # impl ArgminOp for Rosenbrock {
//! #     type Param = Vec<f64>;
//! #     type Output = f64;
//! #     type Hessian = ();
//! #     type Jacobian = ();
//! #     type Float = f64;
//! #
//! #     fn apply(&self, p: &Vec<f64>) -> Result<f64, Error> {
//! #         Ok(rosenbrock_2d(p, 1.0, 100.0))
//! #     }
//! #
//! #     fn gradient(&self, p: &Vec<f64>) -> Result<Vec<f64>, Error> {
//! #         Ok(rosenbrock_2d_derivative(p, 1.0, 100.0))
//! #     }
//! # }
//! #
//! # fn run() -> Result<(), Error> {
//! # // Define initial parameter vector
//! # let init_param: Vec<f64> = vec![1.2, 1.2];
//! #
//! # let iters = 35;
//! # let solver = Landweber::new(0.001);
//! #
//! let res = Executor::from_checkpoint(".checkpoints/optim.arg", Rosenbrock {})
//!     .unwrap_or(Executor::new(Rosenbrock {}, solver, init_param))
//!     .max_iters(iters)
//!     .checkpoint_dir(".checkpoints")
//!     .checkpoint_name("optim")
//!     .checkpoint_mode(CheckpointMode::Every(20))
//!     .run()?;
//! #
//! # // Wait a second (lets the logger flush everything before printing to screen again)
//! # std::thread::sleep(std::time::Duration::from_secs(1));
//! # println!("{}", res);
//! # Ok(())
//! # }
//! #
//! # fn main() {
//! #     if let Err(ref e) = run() {
//! #         println!("{}", e);
//! #     }
//! # }
//! ```
//!
//! # Implementing an optimization algorithm
//!
//! In this section we are going to implement the Landweber solver, which essentially is a special
//! form of gradient descent. In iteration `k`, the new parameter vector `x_{k+1}` is calculated
//! from the previous parameter vector `x_k` and the gradient at `x_k` according to the following
//! update rule:
//!
//! `x_{k+1} = x_k - omega * \nabla f(x_k)`
//!
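//! As a quick, purely illustrative sanity check of the update rule (hypothetical numbers, not
//! taken from a real run): with `omega = 0.5`, `x_k = [2.0, 1.0]` and
//! `\nabla f(x_k) = [4.0, -2.0]`, one obtains
//! `x_{k+1} = [2.0 - 0.5 * 4.0, 1.0 - 0.5 * (-2.0)] = [0.0, 2.0]`.
//!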
//! In order to implement this using the argmin framework, one first needs to define a struct which
//! holds data specific to the solver. Then, the `Solver` trait needs to be implemented for the
//! struct. This requires setting the associated constant `NAME` which gives your solver a name.
//! The `next_iter` method defines the computations performed in a single iteration of the solver.
//! Via the parameters `op` and `state` one has access to the operator (cost function, gradient
//! computation, Hessian, ...) and to the current state of the optimization (parameter vectors,
//! cost function values, iteration number, ...), respectively.
//!
//! ```rust
//! use argmin::prelude::*;
//! use serde::{Deserialize, Serialize};
//!
//! // Define a struct which holds any parameters/data which are needed during the execution of the
//! // solver. Note that this does not include parameter vectors, gradients, Hessians, cost
//! // function values and so on, as those will be handled by the `Executor`.
//! #[derive(Serialize, Deserialize)]
//! pub struct Landweber<F> {
//!     /// omega
//!     omega: F,
//! }
//!
//! impl<F> Landweber<F> {
//!     /// Constructor
//!     pub fn new(omega: F) -> Self {
//!         Landweber { omega }
//!     }
//! }
//!
//! impl<O, F> Solver<O> for Landweber<F>
//! where
//!     // `O` always needs to implement `ArgminOp`
//!     O: ArgminOp<Float = F>,
//!     // `O::Param` needs to implement `ArgminScaledSub` because of the update formula
//!     O::Param: ArgminScaledSub<O::Param, O::Float, O::Param>,
//!     F: ArgminFloat,
//! {
//!     // This gives the solver a name which will be used for logging
//!     const NAME: &'static str = "Landweber";
//!
//!     // Defines the computations performed in a single iteration.
//!     fn next_iter(
//!         &mut self,
//!         // This gives access to the operator supplied to the `Executor`. `O` implements
//!         // `ArgminOp` and `OpWrapper` takes care of counting the calls to the respective
//!         // functions.
//!         op: &mut OpWrapper<O>,
//!         // Current state of the optimization. This gives access to the parameter vector,
//!         // gradient, Hessian and cost function value of the current, previous and best
//!         // iteration as well as current iteration number, and many more.
//!         state: &IterState<O>,
//!     ) -> Result<ArgminIterData<O>, Error> {
//!         // First we obtain the current parameter vector from the `state` struct (`x_k`).
//!         let xk = state.get_param();
//!         // Then we compute the gradient at `x_k` (`\nabla f(x_k)`)
//!         let grad = op.gradient(&xk)?;
//!         // Now subtract `\nabla f(x_k)` scaled by `omega` from `x_k` to compute `x_{k+1}`
//!         let xkp1 = xk.scaled_sub(&self.omega, &grad);
//!         // Return the new parameter vector, which will then be used by the `Executor` to
//!         // update `state`.
//!         Ok(ArgminIterData::new().param(xkp1))
//!     }
//! }
//! ```
//!
//! # TODOs
//!
//! * More optimization methods
//! * Automatic differentiation
//! * Parallelization
//! * Tests
//! * Evaluation on real problems
//! * Evaluation framework
//! * Documentation & Tutorials
//! * C interface
//! * Python wrapper
//! * Solver and problem definition via a config file
//!
//! Please open an [issue](https://github.com/argmin-rs/argmin/issues) if you want to contribute!
//! Any help is appreciated!
//!
//! # License
//!
//! Licensed under either of
//!
//! * Apache License, Version 2.0,
//!   ([LICENSE-APACHE](https://github.com/argmin-rs/argmin/blob/master/LICENSE-APACHE) or
//!   http://www.apache.org/licenses/LICENSE-2.0)
//! * MIT License ([LICENSE-MIT](https://github.com/argmin-rs/argmin/blob/master/LICENSE-MIT) or
//!   http://opensource.org/licenses/MIT)
//!
//! at your option.
//!
//!
//! ## Contribution
//!
//! Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion
//! in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above,
//! without any additional terms or conditions.

#![warn(missing_docs)]
#![allow(unused_attributes)]
// Explicitly disallow EQ comparison of floats. (This clippy lint is denied by default; however,
// this is just to make sure that it will always stay this way.)
#![deny(clippy::float_cmp)]

extern crate rand;

/// Core functionality
#[macro_use]
pub mod core;

/// Definition of all relevant traits and types
pub mod prelude;

/// Solvers
pub mod solver;

/// Macros
#[macro_use]
mod macros;

#[cfg(test)]
mod tests;