Volume 2008, Issue 1 861275
Research Article
Open Access

The Generalized PSO: A New Door to PSO Evolution

J. L. Fernández Martínez

Corresponding Author

J. L. Fernández Martínez

Department of Mathematics University of Oviedo C/Calvo Sotelo S/N 33007 Oviedo, Spain

Search for more papers by this author
E. García Gonzalo

E. García Gonzalo

Department of Mathematics University of Oviedo C/Calvo Sotelo S/N 33007 Oviedo, Spain

Search for more papers by this author
First published: 12 May 2008
Citations: 51
Academic Editor: Riccardo Poli

Abstract

A generalized form of the particle swarm optimization (PSO) algorithm is presented. Generalized PSO (GPSO) is derived from a continuous version of PSO adopting a time step different than the unit. Generalized continuous particle swarm optimizations are compared in terms of attenuation and oscillation. The deterministic and stochastic stability regions and their respective asymptotic velocities of convergence are analyzed as a function of the time step and the GPSO parameters. The sampling distribution of the GPSO algorithm helps to study the effect of stochasticity on the stability of trajectories. The stability regions for the second-, third-, and fourth-order moments depend on inertia, local, and global accelerations and the time step and are inside of the deterministic stability region for the same time step. We prove that stability regions are the same under stagnation and with a moving center of attraction. Properties of the second-order moments variance and covariance serve to propose some promising parameter sets. High variance and temporal uncorrelation improve the exploration task while solving ill-posed inverse problems. Finally, a comparison is made between PSO and GPSO by means of numerical experiments using well-known benchmark functions with two types of ill-posedness commonly found in inverse problems: the Rosenbrock and the “elongated” DeJong functions (global minimum located in a very flat area), and the Griewank function (global minimum surrounded by multiple minima). Numerical simulations support the results provided by theoretical analysis. Based on these results, two variants of Generalized PSO algorithm are proposed, improving the convergence and the exploration task while solving real applications of inverse problems.

1. Introduction

Particle swarm optimization (PSO) is an evolutionary computation technique [1] for optimization which is inspired by the social behavior of individuals in groups in nature. The particle swarm algorithm applied to optimization problems is very simple.

(1)Individuals, or particles, are represented by vectors whose length is the number of degrees of freedom of the optimization problem.

(2) To start, a population of particles is initialized with random positions () and velocities (). A misfit or cost function is evaluated for each particle.

(3) As time advances, the positions and velocities of each particle are updated as a function of its own history of misfits, and the misfit of its neighbors. At time-step k + 1, positions and velocities of the individuals are calculated as follows:
()
with
()
where is the ith particle best position, gk  is the global best position found so far, ϕ1, ϕ2 are the global and local accelerations, and ω is a real constant, called inertia.
The real scalar 1 in (1) stands for the time step necessary to make this algorithm dimensionally correct, as corresponds to the relationship between trajectories, velocities, and accelerations. The random numbers, r1  and r2, affect the local and global acceleration terms, al  and ag, causing the particle trajectory to oscillate at each iteration around its center [2], named
()
Thus, relative values of ϕ1 and ϕ2 affect the balance between local and global search. PSO algorithm has been also adapted to incorporate a different kind of topologies concerning the “informant group.”

Convergence properties and PSO trajectories were analyzed, and helped to clarify some numerical aspects of the PSO algorithm; see [26]. Also, based on these analysis, some parameter selection strategies were proposed (see, e.g., [5]).

Stability analysis of the particle dynamics has also been done in presence of randomness by Kadirkamanathan et al. [7], using Lyapunov techniques. As result of this analysis, they found a sufficient condition on (ω,ϕ) to achieve convergence:
()
Nevertheless, this condition is very restrictive, and this region does not include the best parameters found in the literature.

Physical analogy with a damped mass-spring oscillator was introduced by Brandstätter and Baumgartner [8] to optimize electrical engineering problems, nevertheless, this analogy was not completely exploited in their work. Recently, Fernández Martínez et al. [9] used this analogy to derive the PSO continuous model and to present a systematic study of PSO trajectories by means of stability analysis of the deterministic PSO difference equations involved and fixed point techniques. The PSO convergence can then be explained in terms of some combined criteria: trajectory attenuation, trajectory oscillation, and center attraction potential, explaining the success in achieving convergence of some parameter sets found in the literature.

Interdisciplinary approaches, from classical Newtonian mechanics to molecular dynamics theory, have been recently proposed by Mikki and Kishk [10] to study the thermodynamic behavior of PSO, providing new criterion to analyze the algorithm convergence. Also, stagnation analysis [11] and statistical approaches [12] were used to characterize the sampling distribution of PSO in presence of stochasticity and to provide criteria for the PSO parameter selection.

In this paper, we present a generalized form of the particle swarm optimization (PSO), which is derived from a continuous version of PSO. This idea has been partially presented in [9, 10, 13], but in this manuscript is fully developed.

(1) The deterministic and stochastic stability regions and the asymptotic velocities of convergence of the GPSO algorithm are analyzed as a function of the time step, the local and global accelerations, and the inertia value.

(2) Generalized and continuous PSO models are compared in terms of attenuation and oscillation properties as a function of the time step and the PSO parameters. As expected, GPSO properties tend to those of the continuous PSO model, as the time step goes to zero.

(3) The sampling distribution of the GPSO algorithm is also analyzed and helps to study the effect of stochasticity on the stability of trajectories. The stability region for the second-, third-, and fourth-order moments depends on inertia, local, and global accelerations and time discretization step. These regions are smaller in size than the corresponding deterministic stability region for the same time step. Regions increase in size as time step decreases. Also, for a given time step, the maximum size of the second-order stability region occurs when local and global accelerations are equal. Properties of the second-order moments—variance, covariance, variogram, and correlogram—are also explored with a moving center of attraction, assuming that l(t)  and g(t)  exhibit a deterministic behavior. This analysis serve to propose some promising parameter sets. High variance and temporal uncorrelation improve the exploration task needed to solve ill-posed inverse problems. A similar idea has been also proposed by Clerc [11] under stagnation.

(4) Finally, a comparison is made between PSO and the GPSO by means of numerical experiments using well-known benchmark functions supporting the theoretical results. Based on these results two variants of generalized PSO algorithm are proposed, improving the convergence and the exploration task while solving real applications in inverse problems modeling.

2. The Pso Difference Equations

In the PSO algorithm, the following vectorial difference equation is involved for each particle in the swarm:
()
where ϕ = ϕ1 + ϕ2  is a random variable with trapezoidal (or triangular if al = ag) distribution, and i is the particle number.
Let us introduce the variables , that is, particle′s positions in iteration k  are referred to their oscillation center (3).   Then, the following second-order difference equation is involved:
()
When the oscillation center stabilizes (stagnation case), then the PSO algorithm involves the following vectorial stochastic difference equation [6]:
()
In this model, trajectories are considered as random functions.

Let us introduce the PSO deterministic case which is modeled by difference equation (7) and ϕ  is in this case is a real constant. As we will show on the stochastic stability analysis, this model describes the mean behavior of random trajectories, , when is in this case the average value of random variable ϕ.

The deterministic stability region of the PSO is the part of the space , where the roots of the characteristic equation
()
are in the unit circle. This region turns to be [4]
()
The border line between complex and real roots of the characteristic equation (8) is the parabola
()
Model (7) has been generalized, for any time t, to the so-called PSO-discrete functional model,
()
that provides the continuous analytical function that overlaps the discrete PSO points, and allows us to perform a systematic study of the PSO trajectories under stagnation [9]. Four different zones have been differentiated (see Figure 1) showing a different behavior, depending on the character of the eigenvalues of the characteristic equation (8) associated to the PSO difference equation.
Details are in the caption following the image
Trajectories for particles in different zones of the PSO deterministic stability region. ZONE 1: complex zone, ZONE 2: real symmetrical oscillating zone, ZONE 3: real asymmetrical oscillating zone, ZONE 4: real nonoscillating zone. The points represent the PSO values. Continuous line is the solution of the PSO-discrete model at real times. Envelopes curves for trajectories are also shown.
Details are in the caption following the image
Trajectories for particles in different zones of the PSO deterministic stability region. ZONE 1: complex zone, ZONE 2: real symmetrical oscillating zone, ZONE 3: real asymmetrical oscillating zone, ZONE 4: real nonoscillating zone. The points represent the PSO values. Continuous line is the solution of the PSO-discrete model at real times. Envelopes curves for trajectories are also shown.
Details are in the caption following the image
Trajectories for particles in different zones of the PSO deterministic stability region. ZONE 1: complex zone, ZONE 2: real symmetrical oscillating zone, ZONE 3: real asymmetrical oscillating zone, ZONE 4: real nonoscillating zone. The points represent the PSO values. Continuous line is the solution of the PSO-discrete model at real times. Envelopes curves for trajectories are also shown.
Details are in the caption following the image
Trajectories for particles in different zones of the PSO deterministic stability region. ZONE 1: complex zone, ZONE 2: real symmetrical oscillating zone, ZONE 3: real asymmetrical oscillating zone, ZONE 4: real nonoscillating zone. The points represent the PSO values. Continuous line is the solution of the PSO-discrete model at real times. Envelopes curves for trajectories are also shown.
Details are in the caption following the image
Trajectories for particles in different zones of the PSO deterministic stability region. ZONE 1: complex zone, ZONE 2: real symmetrical oscillating zone, ZONE 3: real asymmetrical oscillating zone, ZONE 4: real nonoscillating zone. The points represent the PSO values. Continuous line is the solution of the PSO-discrete model at real times. Envelopes curves for trajectories are also shown.
The PSO algorithm can also be written in terms of the absolute position and velocity (xk,vk)  in each iteration, as the stochastic dynamical system
()
where
()
The same considerations about random variables ϕ, ϕ1, ϕ2, and the use of fixed point techniques applied to (12), allow us to determine the deterministic stability region for the (xk+1,vk+1)  system. The deterministic stability region with moving center of attraction turns to be SD. Also, this methodology allows us to obtain information about the asymptotic velocity of convergence, as a function the spectral radius ρ of the iteration matrix MPSO  [9]:
()
This velocity tends to infinite on the vertex of the parabola (10), that is, the point .

3. The Pso Continuous Model

Let us introduce the following stochastic differential equation:
()
where l(t) and g(t) are respectively the trajectories of the global and local best trajectories associated to each particle in the swarm, and ϕ, ϕ1, ϕ2 are random variables.
This can be considered the PSO continuous model, describing the continuous movement of each coordinate of any particle in the swarm. Referred to the oscillation center,
()
the continuous model becomes
()
As mentioned before, this last differential equation becomes homogeneous when the oscillation center is stable (stagnation), helping to simplify the PSO analysis.
Let us consider for each of the ith particle coordinates, a centered discretization in acceleration and a regressive schema in velocity in time t = k,  and a unit discretization time step, Δt = 1:
()
It is straightforward to arrive at the PSO-discrete functional model (5).

The PSO differential model (15) has been deduced from a mechanical analogy [9]: a damped mass-spring system of unit mass, damping factor,  b = 1 − w, and stiffness constant, k = ϕ. The knowledge of the trajectories of both the continuous and the discrete model helped to explain why some zones of the parameter space (w,ϕ) provide better convergence, in terms of some combined criteria: trajectory attenuation, trajectory oscillation, and center attraction potential. The same conclusions are true for the GPSO as it will be demonstrated.

4. The Generalized Pso

The idea we propose here is to generalize the PSO algorithm for any time step, Δt. Without loss of generality, the theoretical development is done in one dimension, as if we were reasoning separately for each particle coordinate.

Let us consider the same discretization schemes, mentioned above, for velocity and acceleration at time t,
()
and introduce them on the PSO continuous model (15). Then, the following second-order difference equation is obtained for any real times t and t + Δt:
()
where
()
Stochastic functional equation (20) corresponds to the so-called generalized PSO algorithm.
The GPSO algorithm can be written in terms of the absolute position and velocity (x(t),v(t)) as follows:
()
Introducing Δx(t + Δt) = v(t + Δt)Δt,  the algorithm becomes
()
This model is valid for any of the particles of the swarm in any dimension. Considering t = k, Δt = 1, we arrive at the PSO algorithm. Also, taking into account expression (1), it is easy to show that this algorithm corresponds to a modified PSO with inertia
()
and total acceleration
()
and thus, the effect of decreasing Δt  (PSO corresponds to Δt = 1)  is to lower the total acceleration according to Δt2, and increasing the inertia proportionally to Δt. Thus, the unit-mass damped spring system involved has a damping factor
()
instead of b = 1 − w,  and a stiffness constant
()
instead of k = ϕ.  This means that the swarm movement becomes more elastic and less damped for Δt → 0.  In fact, the pair (wΔt,ϕΔt) follows the parabola
()
approaching to the point (wΔt,ϕΔt) = (1,0), as Δt decreases.

4.1. GPSO deterministic stability analysis

This approach allows us also to calculate the deterministic stability region of the generalized PSO.

4.1.1. Stagnation case

In case of stagnation, the GPSO difference equation referred to the oscillation center becomes
()
where A,B are given by (21).
The deterministic stability region is the part of the space , where the roots of the characteristic equation
()
are in the unit circle. This region (Figure 2) turns to be
()
The shadowed part represents the zone where roots are complex. In the rest, both roots are real. Parabola separating real and complex roots is given by
()
One can show that as Δt → 0, the stability region of the generalized PSO tends to the continuous stability region, that is,
()
The upper border line of the PSO algorithm
()
is in fact due to the time discretization adopted for PSO (Δt = 1),  and comes from a mathematical restriction to achieve stability. Fernández Martínez et al. [9] mentioned this effect, comparing the PSO stability region to its continuous counterpart. Border line (32) separating real and complex roots also follows a similar behavior. Figure 3 shows numerically, when Δt → 0, how the discrete stability zone increases its size, approaching the continuous region (33). Figure 4 shows that, for a given parameter set (w,ϕ)  on the stability zone, the GPSO trajectories tend to the continuous as Δt decreases, and thus, the GPSO sampling becomes more dense. On the opposite, as Δt  increases, the stability triangle of the generalized PSO shrinks.
Details are in the caption following the image
Deterministic stability region for the generalized PSO.
Details are in the caption following the image
Evolution of stablity zone for different decreasing Δt, approaching the continuous case.
Details are in the caption following the image
Convergence of the generalized PSO trajectories to the continuous as Δt → 0.  Black points represent GPSO values.
Details are in the caption following the image
Convergence of the generalized PSO trajectories to the continuous as Δt → 0.  Black points represent GPSO values.
Details are in the caption following the image
Convergence of the generalized PSO trajectories to the continuous as Δt → 0.  Black points represent GPSO values.

4.1.2. Moving center of attraction

The same deterministic stability region, SGPSO, can be deduced applying fixed point techniques to the GPSO system:
()
This proves that the GPSO deterministic stability analysis shown before is also valid when the oscillation center changes.

Figure 5 shows the deterministic spectral radius—associated to the asymptotic velocity of convergence—for the generalized PSO. The similarity with the PSO case (Δt = 1) is absolute, and the black zone of high velocity increases in size. The asymptotic velocity tends to infinite on the parabola vertex, (1 − 1/Δt, 1/Δt2).

Details are in the caption following the image
Spectral radius for PSO and for GPSO with time step Δt = 0.5.  Black area close to the parabola vertex stands for null spectral radius.
Details are in the caption following the image
Spectral radius for PSO and for GPSO with time step Δt = 0.5.  Black area close to the parabola vertex stands for null spectral radius.

5. Some Interesting Properties of The Gpso

With GPSO, we have performed a numerical analysis that seeks to explain why some pairs of (w,ϕ) in certain zones of the deterministic convergence triangle work in achieving convergence and why others do not.

5.1. Center attraction

For a GPSO simplified model where , without objective function, two tests have been made: one deterministic (Figure 6(a)) and the other using random ϕ parameters (Figure 6(b)).

Details are in the caption following the image
Figure 6 (a) Deterministic case
Effect of the randomness in the center attraction capability for GPSO with Δt = 0.5.
Details are in the caption following the image
Figure 6 (b) Random case
Effect of the randomness in the center attraction capability for GPSO with Δt = 0.5.

For the deterministic one,

  • (1)

    100 particles, are randomly initialized over the two dimensional domain [−100,100] × [−100,100];

  • (2)

    a parameter grid (ω,ϕ)  is defined over the region (ω,ϕ) ∈ [−3,1] × [0,18]  with the following grid steps: Δw = 0.06,  Δϕ = 0.02;

  • (3)

    each particle ξp=(xp,yp)  follows the simplified trajectory:

    ()
    for each point (ω,ϕ)  of the parameter grid;

  • (4)

    for each (ω,ϕ),  the particle nearest to the attractor point—(0,0)  in this case—is found after 100 iterations. Its distance to the attractor point,  dmin (ω,ϕ), is calculated and expressed in logarithmic scale;

  • (5)

    The contour map log dmin (ω,ϕ)  is plotted over the parameter region [−3,1] × [0,18].

In the random case, the same test has been performed, but 300 simulations have been made for each (ω,ϕ) (instead of only one simulation in the deterministic case) and the median of dmin (ω,ϕ) has been computed. It can be observed that in the upper-right zone of the convergence triangle there is no convergence (Figure 6(b)). This is due to the high amplitudes and frequencies of the trajectories in this zone.

5.2. Comparison to the PSO continuous model

It has been observed that in zones of optimal convergence, the continuous and discrete trajectories differ a little [9]. Figure 7 shows the relative logarithmic misfit between the GPSO for Δt = 0.5 and the PSO continuous model, both using their trajectories and/or their respective envelope curves. It can be observed that as Δt  decreases, the size of the zone of similarity increases. The PSO stability triangle is embedded in this zone.

Details are in the caption following the image
Figure 7 (a) Misfit using trajectories
GPSO for Δt = 0.5.  (a) logarithmic relative misfit L2  error between continuous and discrete solutions; (b) logarithmic relative misfit L2  error using envelope curves.
Details are in the caption following the image
Figure 7 (b) Misfit using envelopes
GPSO for Δt = 0.5.  (a) logarithmic relative misfit L2  error between continuous and discrete solutions; (b) logarithmic relative misfit L2  error using envelope curves.

5.3. Attenuation and oscillation

No convergence close to the point (w,ϕ) = (1 − 1/Δt,1/Δt2), vertex of the parabola, can be explained by the quick attenuation of the trajectories amplitude and fast convergence to the oscillation center, that in the first moments of the application of the algorithm is usually far from the optimum. Figure 8 shows, for different (w,ϕ) points on the stability region, the number of iterations (expressed in logarithmic scale) needed to reduce the amplitude of the GPSO trajectories in 90% for Δt = 0.5, and the comparison to PSO and to the continuous case. As in the previous figure, similarity with the continuous increases as Δt → 0.

Details are in the caption following the image
Figure 8 (a) Continuous
Number of iterations needed to decrease 90% in amplitude, expressed in logarithmic scale, with initial conditions xi(0) = 1,  νi(0) = 0. (a) continuous case, (b) PSO, (c) GPSO with Δt = 0.5.
Details are in the caption following the image
Number of iterations needed to decrease 90% in amplitude, expressed in logarithmic scale, with initial conditions xi(0) = 1,  νi(0) = 0. (a) continuous case, (b) PSO, (c) GPSO with Δt = 0.5.
Details are in the caption following the image
Figure 8 (c) GPSO Δ t = 0.5
Number of iterations needed to decrease 90% in amplitude, expressed in logarithmic scale, with initial conditions xi(0) = 1,  νi(0) = 0. (a) continuous case, (b) PSO, (c) GPSO with Δt = 0.5.

The area between envelopes for the discrete trajectories has been used as a measure of the exploratory capacity of the PSO. It can be seen that it is higher as w increases in the complex zone and near to the upper convergence limit in the real zone. Figure 9 shows the area between envelope curves for the GPSO with Δt = 0.5  and its comparison to the PSO case. These two last magnitudes (attenuation and oscillation) have been proposed as measures of the exploration capabilities associated to the deterministic PSO trajectories [9].

Details are in the caption following the image
Area between envelope curves using logarithmic scale. (a) PSO, (b) GPSO with Δt = 0.5.
Details are in the caption following the image
Figure 9 (b) GPSO Δt = 0.5
Area between envelope curves using logarithmic scale. (a) PSO, (b) GPSO with Δt = 0.5.

6. The Sampling Distribution of The Gpso Algorithm

In this section, we analyze the effect of stochasticity on the stability of GPSO trajectories using the methodology first proposed by Poli [12]. Trajectories are modeled as stochastic processes whose univariate and bivariate statistical moments must satisfy the PSO difference equations involved. The methodology consists in discretizing the PSO continuous model (15) and applying fixed point analysis to the dynamical systems deduced for the first- to fourth-order moments describing the trajectories stability. This analysis is performed under stagnation and with a moving center of attraction. We show that the stability regions are the same in both cases.

This methodology becomes, nevertheless, complicated and time consuming since it involves a linear system of the kind
()
with iteration matrix MnM(n(n + 3)/2, n(n + 3)/2), for controlling the stability of the n-order moments.

6.1. Stagnation case

6.1.1. First- and second-order moments

Let us consider the GPSO difference equation
()
where A,B  are given by (21).
Let us also consider the vector describing the dynamics of the first- and second-order moments:
()
It is straight forward to show that the stochastic GPSO dynamics can be written on algebraic form as
()
where
()
The following results are of interest.

(1) Stability of the first-order moments only involves the matrix , and logically provides the same results as GPSO deterministic stability analysis.

(2) The stability of the second-order moments involves the matrix
()
(3) Analysis of the fixed point linear system involved
()
has allowed us to determine the border line of the second order stability region. Calling it is possible to show that this line is a hyperbola and has the following explicit equation:
()
This line generalizes for any Δt, the border line of the second-order stability region deduced by Poli [12] for Δt = 1 (PSO) and α = 1  (al = ag). The GPSO stochastic stability region turns to be
()
which is embedded on the deterministic stability region SGPSO. Also, it is possible to show that this region reaches its maximum size if α = 1 (al = ag).

Figure 10 shows the contour lines of the spectral radius of the stochastic iteration matrix on the unit circle for Δt = 0.5  and α = 1. As it can be observed, the isolines bend on their upper part, remembering the results shown in Figure 6(b).

Details are in the caption following the image
Contour plot spectral radius of iteration matrix for al = ag  (α = 1) and Δt = 0.5 in the second-order stability region.

Finally, the second-order stability region increases its size as Δt diminishes. Besides, GPSO algorithm with Δt > 1 introduces for the same (w,ϕ) a higher variance on the trajectories.

The same methodology can be applied to determine the skewness and kurtosis stability regions. Results are presented for a moving center of attraction.

6.2. Moving center of attraction

Let us consider the GPSO difference equation in absolute positions:
()
where A,B  are given by (21). In what follows we will consider g(t) and l(t)  as deterministic functions.
Let us also consider the vector
()
which describes the first- and second-order GPSO “absolute” dynamics. The GPSO system can be written on algebraic form as
()
where
()
In this case, the dynamics of the second-order moments also depend on the first-order moments.

One can show the following.

(1) The stochastic stability region for the first- and second-order moments is the same that in case of stagnation.

(2) The fix points for y2(t + Δt) are the following:
()
()
()
where
()
and tp stands for the time when the fixed point is reached.

(3) The variance increases its value from 0  in until + on the stochastic border line. Isolines are shown in Figure 11(a). Variance is also null for α = 2 (al = 0), α = 0 (ag = 0).

Details are in the caption following the image
Figure 11 (a) Variance (log)
Contour plots of the main second-order moments for al = 2ag  (α = 0.5) and Δt = 0.8: (a) variance (in logaritmic scale), (b) covariance (sign(cov)|log |cov||), (c) variogram (log), (d) correlation coefficient.
Details are in the caption following the image
Figure 11 (b) Covariance (sign(cov)|log |cov||)
Contour plots of the main second-order moments for al = 2ag  (α = 0.5) and Δt = 0.8: (a) variance (in logaritmic scale), (b) covariance (sign(cov)|log |cov||), (c) variogram (log), (d) correlation coefficient.
Details are in the caption following the image
Figure 11 (c) Variogram (log)
Contour plots of the main second-order moments for al = 2ag  (α = 0.5) and Δt = 0.8: (a) variance (in logaritmic scale), (b) covariance (sign(cov)|log |cov||), (c) variogram (log), (d) correlation coefficient.
Details are in the caption following the image
Figure 11 (d) Correlation coefficient
Contour plots of the main second-order moments for al = 2ag  (α = 0.5) and Δt = 0.8: (a) variance (in logaritmic scale), (b) covariance (sign(cov)|log |cov||), (c) variogram (log), (d) correlation coefficient.

(4) The covariance is zero in and in the line , which is one of the median of the deterministic stability triangle. Covariance is negative above this line. Figure 11(b) shows the contour plot of sgn(Cov) · |ln |Cov||⋅ Covariance is also null for any Δt if al = 0  or ag = 0.

(5) A useful measure for trajectories dispersion is the variogram, defined as
()
Variogram shows that dispersion increases from 0 in until + on the second-order stability border line. The variogram is also null for any Δt if al = 0  or ag = 0. Isolines are shown in Figure 11(c).
(6) The correlation coefficient is
()

for any point on the stability region of the second-order moments, and does not depend on α, neither on   the local and global best trajectories. Correlation coefficient is 1 in and is null on the line Isolines are straight lines passing in (w,ϕ) = (1 − 2/Δt,0)  (Figure 11(d)).

(7) A priori a good parameter choice is given by the intersection by the line, (no correlation between trajectories), with the border line of the stochastic stability region (maximum variance and dispersion). The point of intersection has the following coordinates:
()
which in the case of PSO and α = 1 (maximum size of the stochastic stability region) is
(8) This analysis can be generalized when l(t) and g(t)  are considered as stochastic processes. (The results presented here are valid considering that the center of attraction and the trajectories are independent. Obviously, this is a wrong hypothesis. The analysis of the general case will be addressed in a future paper devoted to this important subject.) The first- and second-order stability zones do not change. The mean is, in this case,
()
which generalizes the expression (50). The variance, covariance, and variogram show now a more complicated dependency which involves the first- and second-order moments of l(t)  and g(t)  in tp,  instead of For instance, the variance and covariance become
()
These expressions simplifies to (51), (52), and (54) in case l(t) and g(t)  are deterministic. Correlogram (55) has the same expression in both cases.

6.2.1. Skewness

The stability analysis for the third-order moments involves a linear system with an 9 × 9  iteration matrix. The iteration matrix extra terms corresponding to the different third-order moments  (E(x3(t)),E(x3(tΔt)),E(x2(t)x(tΔt)),E(x(t)x2(tΔt)))  are
()

Our analysis provides the following results.

  • (i)

    The region where the skewness fix point exists coincides with the region of stability of the second-order moments, nevertheless, the region of skewness stability (where the fixed point is correctly approximated by the iterative PSO algorithm) is smaller in size than the region   of second-order stability. Both regions coincide for many w  values on (−1,1). This result has been also pointed by Poli [12] in the PSO case, for inertia values restricted to the interval [0,1]. Also, the region of skewness stability tends to the region of second-order stability as time step, Δt, goes to zero.

  • (ii)

    Skewness is null for any Δt  if α = 1  (ag = al).  Skewness is positive in the upper zone under the second-order stability hyperbola for α < 1 (ag < al),  and negative in the bottom zone. For α > 1 (ag > al)  the sign swaps between these two zones for any Δt (Figure 12(a)). Skewness does only depend on the sign of g(tp) − l(tp) but not on its absolute value.

Details are in the caption following the image
Figure 12 (a) Skewness
(a) Contour plot of skewness for al = 2ag (α = 0.5) and Δt = 0.8. (b) Stability regions of orders 1, 2, 3, and 4.
Details are in the caption following the image
Figure 12 (b) Stability regions for order 1, 2, 3, and 4
(a) Contour plot of skewness for al = 2ag (α = 0.5) and Δt = 0.8. (b) Stability regions of orders 1, 2, 3, and 4.

6.2.2. Kurtosis

Kurtosis stability analysis involves a linear system with 14 × 14  iteration matrix.

The main results are the following.

  • (i)

    The regions of orders 1, 2, 3, and 4 are nested, being the region of kurtosis stability the smallest on size (Figure 12(b)). This imply that while mean, variance, covariance, and skewness are stable, the kurtosis grows indefinitely. This last result has been also pointed by [12] in the PSO case under stagnation.

  • (ii)

    All these regions of stability tend to coincide as Δt  decreases (GPSO case with Δt < 1). In the limit (Δt → 0), they tend to the continuous stability region (33).

7. Numerical Experiments on Benchmark Functions

To confirm the above-mentioned theoretical results, we ran several numerical experiments on benchmark functions with two types of ill-posedness commonly found in inverse problems: the Rosenbrock and the “elongated” DeJong functions (global minimum located in a very flat area), and the Griewank function (global minimum surrounded by multiple minima). A similar analysis has been done in [9] for the PSO case, showing that as the ill-posedness increases, PSO parameters from the complex stability region, with inertia values from 0.5 to 0.9, and medium to high total accelerations seem to give systematically very good results. Also, in case of cost functions with multiple local minima, the real stability zone of negative inertia values (around −0.55) and total acceleration from 0.6 to 1.2 give very good performance with respect to the percentage of success.

Figure 13 shows the percentage of times (over 100 simulations) that PSO and GPSO (with Δt = 0.5) arrive very close (within a tolerance of 10−5) to the global optimum after 100 iterations. It can be observed that the convergence zone increases in size and the best parameters move towards decreasing inertia values and increasing accelerations, confirming our previous theoretical analysis. Good parameter sets are close to the limit of stochastic stability region (hyperbola (44)). Only for the elongated DeJong function good parameter sets are partly above this line. This parameter region has also been pointed by Clerc [11] in the PSO case, under the stagnation hypothesis, and only for positive inertia values.

Details are in the caption following the image
Figure 13 (a) PSO Rosenbrock
Percentage of success for different functions for PSO and GPSO with Δt = 0.5.
Details are in the caption following the image
Figure 13 (b) PSO De Jong
Percentage of success for different functions for PSO and GPSO with Δt = 0.5.
Details are in the caption following the image
Figure 13 (c) PSO Griewank
Percentage of success for different functions for PSO and GPSO with Δt = 0.5.
Details are in the caption following the image
Figure 13 (d) GPSO Δt = 0.5 Rosenbrock
Percentage of success for different functions for PSO and GPSO with Δt = 0.5.
Details are in the caption following the image
Figure 13 (e) GPSO Δt = 0.5 De Jong
Percentage of success for different functions for PSO and GPSO with Δt = 0.5.
Details are in the caption following the image
Figure 13 (f) GPSO Δt = 0.5 Griewank
Percentage of success for different functions for PSO and GPSO with Δt = 0.5.

For the Griewank function parameters between the median of the deterministic triangle and the hyperbola (44), and inertia values around −0.5 (in the PSO case) give also a very good performance. In the GPSO case, the inertia values are shifted −1/Δt. Negative PSO inertia values have given good results in the PSO case for cost functions with multiple minima [9].

Figure 14 shows the same numerical analysis for the Rosenbrock function defined in 10, both in the PSO (α = 1) and GPSO cases (Δt = 0.8   and  α = 1), under the same tolerance requirements. As the number of dimensions increases, the best parameter sets seem to fit better the limit of the second-order stability. Also, the region of negative inertia values seems to be more important in size than in dimension 2.  In the GPSO case, the “good” parameter region increases its size.

Details are in the caption following the image
Figure 14 (a) Rosenbrock PSO
Percentage of success for the Rosenbrock function in 10 for PSO and GPSO with Δt = 0.8.
Details are in the caption following the image
Figure 14 (b) Rosenbrock GPSO Δt = 0.8
Percentage of success for the Rosenbrock function in 10 for PSO and GPSO with Δt = 0.8.

7.1. Time-step adapted GPSO algorithms

As a result of the above-mentioned analysis and numerical experiments, we propose the following algorithm.

  • (1)

    To specify the search space limits, and to initialize swarm positions randomly within, velocities are initialized to zero and are not limited.

  • (2)

    To choose the parameters for the standard PSO (Δt = 1), as a general rule, a good choice is an inertia value ω  in the range (0.5,0.8)  and a total acceleration given by (44). For cost functions with multiple local minima, inertia values in (−0.6, − 0.5)  and total acceleration following the same rule (44) seem to give also very good results.

  • (3)

    At the exploration stage, the local acceleration ϕ2  should be bigger than the global term, ϕ1 (α < 1).

  • (4)

    At the convergence stage, we propose two different variants with respect to the PSO case.

(a) GPSO algorithm with Δt < 1 The idea is to monitor the percentage of models having at least one parameter on the lower or upper limit of the model search space as a function of iterations. If this percentage decreases significantly (which is expected mainly in the convergence stage), the time step Δt  is reduced (e.g., Δt = 0.8 − 0.9) in order to adapt the PSO amplitudes to locate accurately the minima. Numerical experiments have shown that at the exploration stage it is common that some model parameters reach the search space limits. Nevertheless, this circumstance does not necessarily imply a time-step reduction. On the contrary, at the convergence stage it is important to adapt the PSO amplitudes in order to efficiently locate the optimum.

(b) Mixed GPSO algorithm This algorithm consists in adopting Δt = 1.2 and 0.8 alternatively for the odd and even iterations. When Δt = 1.2  the exploration capabilities increase (higher variance to avoid local minima) and for Δt = 0.8  the search is done more accurately around the global best.

We ran several numerical experiments with the Rosenbrock, DeJong, and Griewank functions. Figure 15 shows the percentage of times (over 100 simulations) that mixed GPSO arrives very close (within a tolerance of 10−5) to the global optimum after 100 iterations. It can be observed that good parameter regions are close to both hyperbolas of the second-order stability (Δt = 1.2  and  Δt = 0.8).

Details are in the caption following the image
Figure 15 (a) Rosenbrock
Percentage of success for the De Jong, Rosenbrock, and Griewank functions for the mixed GPSO algorithm.
Details are in the caption following the image
Figure 15 (b) De Jong
Percentage of success for the De Jong, Rosenbrock, and Griewank functions for the mixed GPSO algorithm.
Details are in the caption following the image
Figure 15 (c) Griewank
Percentage of success for the De Jong, Rosenbrock, and Griewank functions for the mixed GPSO algorithm.

8. Application to an Environmental Real Case

The vertical electrical sounding (VES) is a geophysical direct current technique aiming to characterize the depth-variation of the resistivity distribution of a stratified earth. The VES methodology is as follows.

  • (i)

    On the surface, at each measuring station, two current electrodes and two potential electrodes are symmetrically laid at both sides of a fixed central point.

  • (ii)

    In successive stations, the two external current electrodes are moved apart from each other, increasing their mutual distance, while holding the two inner potential electrodes fixed in place at a much shorter distance. At each position of injection the voltage difference, ΔV, is then measured.

The inverse problem consists in estimating the conductivity and thicknesses of the stratified earth that explains the voltage measurements made on the surface. One, then, needs to define a misfit function to quantify the distance between observations and predictions issued from the earth models. This geophysical inverse problem is nonlinear and ill-posed, that is, different sets of earth models may give similar voltage predictions. In fact, the VES objective function has the minima in a very narrow and flat valley [14], similar to the Rosenbrock case in several dimensions.

Figure 16 shows the convergence curves (logarithmic error versus iterations) for a VES geophysical inverse problem having an important environmental application (salt intrusion in a coastal aquifer), using different GPSO versions. The cost function is in this case defined in 11. As it can be observed, GPSO with Δt = 0.9  performs better than PSO from the first iterations. The mixed GPSO algorithms perform more slowly, nevertheless, it is the PSO variant that reaches the lower misfit. These results obviously depend on the search space size.

Details are in the caption following the image
Vertical electrical sounding case (optimization in 11). Logarithmic convergence curves for Δt = 0.9,  1.0, 0.8–1.2.

9. Conclusions

A new PSO algorithm, generalized PSO, has been presented and analyzed. It comes from the continuous PSO model by considering time steps different from the unit (PSO). Its deterministic and stochastic stability regions and their respective asymptotic velocities of convergence are also a generalization of those of the PSO case. A comparison to PSO and to the continuous model is done, confirming that GPSO properties tend to the continuous as time step decreases. The size of the time steps is thus presented as a numerical constriction factor. Properties of the second-order moments—variance and covariance—serve to propose some promising parameter sets. High variance and temporal uncorrelation improve the exploration task while solving ill-posed inverse problems.

Comparisons between PSO and GPSO, by means of some numerical experiments using well-known benchmark functions and real data issued from environmental inverse problems, show that generalized PSO is more effective than PSO at accurately locating the global minimum of these cost functions. Based on this analysis, we propose two new time-step adapted PSO variants, GPSO, and mixed GPSO that improve the convergence performance of PSO in a geophysical inverse problem in 11.

    The full text of this article hosted at iucr.org is unavailable due to technical difficulties.