A Kinetic Approach to Relativistic Shocks in Astrophysics
Abstract
Stochastic acceleration of charged particles across highly relativistic shocks is often considered to be the main source of the observed emission. It is shown here that the derivation of the appropriate quasilinear equation describing particle transport across such shocks depends on the assumptions made for the power spectra in the upstream region ahead of the shock. For both an ambient magnetic field perpendicular to the shock front and an oblique magnetic field, a derivation is given of the quasilinear diffusion equation for particle transport appropriate to both sides of the shock. There is both pitch angle diffusion and energy diffusion; the relative strengths of the two processes depend on the assumptions made concerning the upstream wave power spectra. Transformations of the diffusion equations into the frame where the shock is stationary are given for the upstream and downstream regions, including both energy diffusion and pitch angle scattering. The remaining outstanding concern is the determination of the continuity of the transport equations across the shock. This latter problem has yet to be solved fully even in the simple case of assumed pitch angle scattering only. Including energy diffusion together with pitch angle scattering presumably makes the determination of the correct continuity behaviour more difficult.
1. Introduction
Highly collimated, highly relativistic, bulk flows from the cores of Active Galactic Nuclei (AGN) are about the only method known that will allow one to account for the high γ-ray brightness observed in conjunction with the extremely rapid temporal fluctuations in brightness. Such bulk flows (with Lorentz Γ values in the several hundred regime) imply massive acceleration processes in the AGNs in order to produce the beams. It is to be doubted that such a production would not also energize individual particles so that they, too, would have thermal energies with Lorentz factors in the several hundred region for protons and even higher for the much less massive electrons.
While the mechanisms for production of such beams remain to be worked out in detail, such beams exit into the interstellar, or intergalactic, medium of each AGN, and so produce relativistic shocks. On the upstream side of such a shock one has the cool interstellar medium, while on the downstream side one has the shocked, highly relativistic plasma. Viewed from a frame of reference in which the shock is stationary, the upstream particles approach the shock at a high Lorentz factor. The question of some considerable astrophysical interest is to describe the kinetic behaviour of waves and particles taken together under such a scenario with as few assumptions as possible so that one has a generic procedure for investigating consequences of such shocks. This question is of considerably more than just pure academic interest. The review by Kirk and Duffy [1] emphasizes the fact that, to date, the description of such relativistic shocks has been handled as an MHD problem with, therefore, a prescribed equation of state and with test particle diffusion only. Indeed, the corresponding diffusion coefficient is kinematically given and, so far, appears to involve only pitch angle scattering in a momentum independent manner.
Now we know from other astrophysical objects, such as supernova remnants, that upstream and downstream wave generation plays a fundamental role in modifying the behaviour of particles that cross shocks, so that a full kinetic description is needed. There is no reason to suppose that similar modifications to the MHD picture for relativistic shocks associated with AGNs will not also produce major alterations to the treatments available to date. And, even if only a slight modification to the MHD picture is produced by a more correct kinetic picture, the conditions under which one can be sure the MHD picture is appropriate are then more sharply defined. In addition, use of an ad hoc diffusion coefficient, not tied to the wave spectra, makes it difficult to investigate the relevant particle acceleration or deceleration. And the justification for using a diffusion coefficient in the first place is based upon the quasilinear theory for homogeneous plasmas. The presence of the shock modifies the patterns of behaviour from those obtaining in a rigorously homogeneous medium, so that one must describe the general problem de novo.
Accordingly, there are excellent astrophysical and physical reasons for considering the basic problem anew from a kinetic particle Vlasov type of approach. It is this development that is considered here.
There are two basic conditions that can exist for steady-state shock propagation with respect to an upstream uniform ambient magnetic field. One can find a frame of reference for which the shock is stationary and for which one has an upstream magnetic field at the de Hoffmann-Teller angle with respect to the shock front. Alternatively, one can find a frame of reference in which the shock is stationary and the ambient magnetic field is perpendicular to the shock front. As noted by Kirk and Duffy [1], the de Hoffmann-Teller reference frame is likely not to be of much direct application to the problem of highly relativistic beams because the allowable angles for the upstream magnetic field, as viewed from the rest frame of the shock, are typically O(1/Γ) with respect to the shock front plane. Thus, the probability is that the remaining part of the angular regime ahead of the shock [O(π − 1/Γ)] is the most appropriate situation, for which the ambient magnetic field can always be arranged to be perpendicular to the shock front. It is precisely that situation that is discussed first; the case where the ambient magnetic field is not so describable will form the second major part of this paper.
2. Part I—Perpendicular Ambient Magnetic Field
There are several categories of background information that are extremely useful in handling the problem of particle transport in a relativistic shock regime. This section of the paper sets up the nomenclature so that the development of the corresponding transport equations (to be handled in the next section) is facilitated. The categories are: definitions of the frames of reference; relevant Lorentz transformations and symbolism within the various frames of reference; field and distribution function structures in the different reference frames; and the general operator structure of the ensemble-averaged transport equations. Consider each in turn.
2.1. Frames of Reference
Three different frames of reference are extremely useful in handling the basic problem of particle behaviour coupled to field fluctuations. First is the frame of reference in which the shock is stationary (this frame of reference is denoted S). Second is the frame of reference in which the upstream particles have zero bulk velocity (this frame of reference is denoted U). Third is the frame of reference in which the downstream particles have no bulk velocity (this frame of reference is denoted D). Figure 1 shows a sketch of the situation in the shock frame of reference S.

For particles of charge e, rest mass m, under the influence of fluctuating electric (E) and magnetic fields (B), and also under the influence of the uniform magnetic field B0 that points in the z direction, one can write the particle distribution function, f, for each species of particles as the solution to the Vlasov equation
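For orientation, a standard form of the relativistic Vlasov equation in Gaussian units, consistent with the symbols just introduced, can be sketched as follows. The arrangement of terms, and the symbols v, p, and γ for the particle velocity, momentum, and Lorentz factor, are assumptions made here for illustration and are not a reproduction of (1).

```latex
\[
\frac{\partial f}{\partial t}
  + \mathbf{v}\cdot\frac{\partial f}{\partial \mathbf{x}}
  + e\left[\mathbf{E} + \frac{\mathbf{v}\times(\mathbf{B}_0+\mathbf{B})}{c}\right]
    \cdot\frac{\partial f}{\partial \mathbf{p}} = 0,
\qquad
\mathbf{v} = \frac{\mathbf{p}}{m\gamma},
\quad
\gamma = \left(1 + \frac{p^{2}}{m^{2}c^{2}}\right)^{1/2}.
\]
```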
In addition to (1), one has the corresponding Maxwell equations:
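In Gaussian units the Maxwell equations for the fluctuating fields take the familiar form sketched below. The symbols ρ and J for the charge and current densities, and their expression as momentum integrals of f summed over particle species, are assumptions introduced here for illustration.

```latex
\[
\begin{aligned}
\nabla\times\mathbf{E} &= -\frac{1}{c}\frac{\partial \mathbf{B}}{\partial t}, &
\nabla\times\mathbf{B} &= \frac{1}{c}\frac{\partial \mathbf{E}}{\partial t}
                          + \frac{4\pi}{c}\,\mathbf{J}, \\
\nabla\cdot\mathbf{E} &= 4\pi\rho, &
\nabla\cdot\mathbf{B} &= 0, \\
\rho &= \sum_{\text{species}} e\int f\,d^{3}p, &
\mathbf{J} &= \sum_{\text{species}} e\int \mathbf{v}\,f\,d^{3}p .
\end{aligned}
\]
```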
Interest centers on determining the downstream distribution functions, together with the downstream electric and magnetic fields and the downstream bulk velocity, UD, given the upstream distribution functions far from the shock, the upstream electric and magnetic fields, and the upstream flow velocity UU. Both UD and UU are in the z direction and, based on the sketch of Figure 1, can be written as the vectors UD = −1z UD and UU = −1z UU, where 1z is the unit vector in the z direction and the magnitudes UD and UU are positive constants as measured in the frame S.
2.2. Lorentz Transformations
Denote by pU,z the z component of momentum of a particle measured in the upstream frame U. Then in the frame S of the shock one measures pS,z with the connection
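A sketch of the standard Lorentz transformation for the particle momentum follows. It assumes that the frame S moves with velocity +UU along z relative to the frame U (equivalently, the upstream plasma moves in the −z direction at speed UU as seen in S); the symbols ΓU for the Lorentz factor of the boost and ε for the particle energy are introduced here for illustration.

```latex
\[
p_{S,z} = \Gamma_U\left(p_{U,z} - \frac{U_U\,\varepsilon_U}{c^{2}}\right),
\qquad
\varepsilon_S = \Gamma_U\left(\varepsilon_U - U_U\,p_{U,z}\right),
\qquad
p_{S,\perp} = p_{U,\perp},
\]
\[
\Gamma_U = \left(1 - \frac{U_U^{2}}{c^{2}}\right)^{-1/2},
\qquad
\varepsilon = \left(m^{2}c^{4} + p^{2}c^{2}\right)^{1/2}.
\]
```

As a check, a particle at rest in U (pU,z = 0, εU = mc²) has pS,z = −ΓU m UU, which is the momentum expected for upstream plasma approaching the shock in the −z direction.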
For angular frequency ω and vector wavenumber k one has the transformation pair
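Because (ω/c, k) forms a four-vector, the transformation pair has the same structure as that for energy and momentum. A sketch, under the assumption that S moves with velocity +UU along z relative to U and with ΓU = (1 − UU²/c²)^(−1/2) denoting the corresponding Lorentz factor, is:

```latex
\[
k_{S,z} = \Gamma_U\left(k_{U,z} - \frac{U_U\,\omega_U}{c^{2}}\right),
\qquad
\omega_S = \Gamma_U\left(\omega_U - U_U\,k_{U,z}\right),
\qquad
\mathbf{k}_{S,\perp} = \mathbf{k}_{U,\perp}.
\]
```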
For the electric and magnetic fields, the connections are that
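The textbook form of the field transformation in Gaussian units is sketched below. The sign convention is an assumption: the frame S is taken to move with velocity +UU along z relative to U, with ΓU = (1 − UU²/c²)^(−1/2), and ẑ denotes the unit vector in the z direction. The grouping of these relations need not coincide with the numbering (7a) through (7c).

```latex
\[
\begin{aligned}
E_{S,z} &= E_{U,z}, &
\mathbf{E}_{S,\perp} &= \Gamma_U\left[\mathbf{E}_{U,\perp}
      + \frac{U_U}{c}\,\hat{\mathbf{z}}\times\mathbf{B}_{U,\perp}\right], \\
B_{S,z} &= B_{U,z}, &
\mathbf{B}_{S,\perp} &= \Gamma_U\left[\mathbf{B}_{U,\perp}
      - \frac{U_U}{c}\,\hat{\mathbf{z}}\times\mathbf{E}_{U,\perp}\right].
\end{aligned}
\]
```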
Note further that a wave phase Φ = k · x − ωt is a Lorentz invariant so that if one were to take the Fourier transform of (7a)–(7c) then
Hence if the upstream components of the electric and magnetic fields are known in the frame U as functions of kU and ωU, then one can immediately write down the corresponding components as measured in the frame S. The corresponding relations hold between the frames S and D. In addition one has the invariant d3kUdωU = d3kSdωS.
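The invariance of the volume element can be verified directly. Assuming the boost along z with speed UU and Lorentz factor ΓU = (1 − UU²/c²)^(−1/2), under which kS,z = ΓU(kU,z − UU ωU/c²), ωS = ΓU(ωU − UU kU,z), and the perpendicular wavenumber components are unchanged, the Jacobian of the transformation is unity:

```latex
\[
\frac{\partial(k_{S,z},\,\omega_S)}{\partial(k_{U,z},\,\omega_U)}
 = \det\begin{pmatrix}
     \Gamma_U & -\Gamma_U U_U/c^{2} \\
     -\Gamma_U U_U & \Gamma_U
   \end{pmatrix}
 = \Gamma_U^{2}\left(1 - \frac{U_U^{2}}{c^{2}}\right) = 1,
\]
\[
\text{so that}\quad d^{3}k_S\,d\omega_S = d^{3}k_U\,d\omega_U .
\]
```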
2.3. Distribution Function Considerations
Because the plane z = 0 is taken to be the steady-state shock front as measured in the frame S, some care has to be taken in setting up and solving the Vlasov equation to obtain the downstream behaviour of particles and waves in relation to the prescribed (on z = ∞) input of particles and waves.
The general argument operates as follows. First, on z = ∞ one specifies the upstream distribution functions and all electric and magnetic fields, including the wave spectrum. As the upstream particles are convected to the shock front (at a velocity −1z UU measured in the frame S), both the known far upstream particle distribution functions and the wave spectrum are altered near the shock by the multiple shock crossings the particles can make and also by the resulting downstream particles and fields.
Second, the downstream particles at z = −∞ are taken to have a steady-state component to their distribution functions plus a fluctuating component, both as measured in the frame S. The “jump” conditions across the shock at z = 0 should then provide the necessary conditions to connect upstream and downstream behaviours, and should also relate the upstream modulation of the particles and waves to the shock reflection and transmission properties.
In order to obtain equations allowing the connections to be made, it is conventional to make a weak turbulence argument (to be described in detail below). However, note that one cannot assume that the background medium is homogeneous everywhere (usually one of the linchpins that permits derivation of a convection-diffusion model from the weak turbulence assumption) because of the presence of the shock. It is, therefore, appropriate to consider the relevant quasilinear equations for the shock-influenced distribution functions in the upstream, downstream, and shock frames, including all assumptions and approximations made.
2.3.1. Formal Manipulations
The formal path to obtaining a quasilinear equation for the time-independent (in the shock frame S) component of the distribution function operates as follows. In the Vlasov equation (1) set fS = FS + δfS where FS is the time independent component and δfS is the time dependent component of the distribution function. Then, quite generally, one has
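The structure of this decomposition can be sketched as follows, under assumptions that are made here for illustration: angle brackets denote the average that isolates the time-independent part, the fluctuating fields E and B and the fluctuation δfS all average to zero, and Gaussian units are used. The grouping of terms need not coincide with that of the numbered equations.

```latex
\[
\mathbf{v}\cdot\frac{\partial F_S}{\partial\mathbf{x}}
 + \frac{e}{c}\,(\mathbf{v}\times\mathbf{B}_0)\cdot\frac{\partial F_S}{\partial\mathbf{p}}
 = -e\left\langle\left(\mathbf{E}+\frac{\mathbf{v}\times\mathbf{B}}{c}\right)
     \cdot\frac{\partial\,\delta f_S}{\partial\mathbf{p}}\right\rangle ,
\]
\[
\begin{aligned}
\frac{\partial\,\delta f_S}{\partial t}
 + \mathbf{v}\cdot\frac{\partial\,\delta f_S}{\partial\mathbf{x}}
 + \frac{e}{c}\,(\mathbf{v}\times\mathbf{B}_0)\cdot\frac{\partial\,\delta f_S}{\partial\mathbf{p}}
 ={}& -e\left(\mathbf{E}+\frac{\mathbf{v}\times\mathbf{B}}{c}\right)
       \cdot\frac{\partial F_S}{\partial\mathbf{p}} \\
    & -e\left[\left(\mathbf{E}+\frac{\mathbf{v}\times\mathbf{B}}{c}\right)
       \cdot\frac{\partial\,\delta f_S}{\partial\mathbf{p}}
      -\left\langle\left(\mathbf{E}+\frac{\mathbf{v}\times\mathbf{B}}{c}\right)
       \cdot\frac{\partial\,\delta f_S}{\partial\mathbf{p}}\right\rangle\right].
\end{aligned}
\]
```

In this sketch the first term on the right hand side of the second equation is linear in the fluctuations, while the bracketed term is of second order in the fluctuations.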
The canonical assumption made in deriving the quasilinear behaviour of FS is that only the first terms on the right hand side of (10) need to be retained so that
Now in the case of a planar shock the spatial dependence of FS can be a function of z only and not of spatial coordinates parallel to the shock front so that (14) takes the simplified form
One further approximation to the general quasilinear equation is to be concerned only with the component of the quasilinear equation that is independent of the phase angle the particles make around the mean magnetic field B0, so that one singles out just that component from the general equation (15), which depends on the particle momentum, the pitch angle of the particles, and on the spatial coordinates.
Then one writes
The general structure of (17) is of the form
2.3.2. Conditions for Different Frames of Reference
(1) The Fields. The coefficients A, B, C and D in (18) depend on the electric and magnetic field fluctuations as measured in frame S (see Appendix B), as well as on pz and p⊥, also measured in the frame S. The electric and magnetic field fluctuations, however, are considered known far upstream and far downstream from the shock in the frames U and D, respectively (where there is no bulk motion of the plasma in the upstream or downstream regions). One therefore needs to write the components of E and B, as measured in frame S, in terms of the field components measured in the frames U and/or D. This is accomplished with the standard transformations given in (7a) through (7c). Accordingly, in the frame S the corresponding components of the power spectra tensor for the electromagnetic fields can be written down in terms of the equivalent values measured in either the upstream or downstream frames U and D, respectively. In turn, one can then write the four coefficients A, B, C, and D in terms of the upstream and downstream wave spectra measured in the rest frames U and D, and in terms of pz and p⊥ together with the bulk transport speeds UU and UD.
(2) The Distribution Function. (i) General Remarks. It is often assumed that in either or both of the frames U and D (i.e., far upstream or far downstream from the shock) the distribution function is effectively isotropic, with changes from isotropy being brought about in the vicinity of the shock by the shock passage through the plasma medium. More generally, in either of the frames U or D the distribution function must be a symmetric function of the z component of momentum measured in that frame, because in neither of the frames U or D does the plasma have a bulk speed.
In the frame S the distribution function cannot be symmetric in the z component of momentum measured in the frame S, because particles far upstream must approach the shock at a speed UU and particles far downstream of the shock must recede from the shock at a speed UD. The distribution function, F0(z, p⊥, pz), measured in terms of quantities in the shock frame S, must therefore have an anisotropic component. In the frames U or D the plasma has no bulk speed, so there the distribution function can be a function only of z, p⊥, and the z component of momentum as measured in the frames U or D. This condition requires that one write the distribution function as measured in the shock frame in terms of the momentum components measured in the frames U and/or D, respectively. This aspect is now considered.
(ii) Frame of Reference Considerations for the Distribution Function Dependence. Denote by subscript U or D the momentum components measured in frames U or D respectively so that, for example, pU,z refers to the z component of momentum measured in the upstream frame U. Then F0(z, p⊥, pz) must be a function of the form
Thus in both the upstream and downstream frames F0 takes on different functional behaviours due to UD and UU being different.
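The connection between the shock-frame momentum and the rest-frame momenta on which F0 depends can be made explicit with the inverse Lorentz transformation. The sketch below assumes boosts along z with speeds UU and UD; the Lorentz factors ΓU, ΓD and the particle energy εS in the frame S are symbols introduced here for illustration.

```latex
\[
p_{U,z} = \Gamma_U\left(p_{z} + \frac{U_U\,\varepsilon_S}{c^{2}}\right),
\qquad
p_{D,z} = \Gamma_D\left(p_{z} + \frac{U_D\,\varepsilon_S}{c^{2}}\right),
\]
\[
\varepsilon_S = \left(m^{2}c^{4} + p_{\perp}^{2}c^{2} + p_{z}^{2}c^{2}\right)^{1/2},
\qquad
\Gamma_{U,D} = \left(1 - \frac{U_{U,D}^{2}}{c^{2}}\right)^{-1/2}.
\]
```

Because UU and UD differ, the same shock-frame momentum pz maps to different rest-frame values pU,z and pD,z, which is the origin of the different functional behaviours on the two sides of the shock.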
The corresponding diffusion equation (18) can then be written either as is, in which case the diffusion coefficients that are given in the frames U and D must be transformed to coordinates appropriate to the shock frame and one must also use the fact that F0 is represented solely as expressed in (19), or one can transform the diffusion equation to “mixed” momentum coordinates p⊥, pU,z, in which case one has the diffusion coefficients given in the frames U or D together with the representation of F0 as a function of p⊥ and pU,z. Equally, in the downstream frame D one can write a similar expression with the replacement of subscripts U by subscripts D. While such a mixed momentum coordinate system simplifies enormously a discussion of the basic diffusion equation, the transformation to such a mixed momentum coordinate system also complicates the conditions the distribution function must obey at the shock z = 0.
Written in terms of the mixed momentum coordinates p⊥, pU,z and with wz = pU,z/(mγ0), w⊥ = pU,⊥/(mγ0), the diffusion equation (18) takes on the form
3. Conditions across the Shock at z = 0
At the stage where one has written down the diffusion equation (21) in terms of the mixed momentum coordinates, the precise structure of the four diffusion coefficients, A, B, C, D, becomes important in respect of their dependences on p⊥ and pU,z (or pD,z); the dependences are also influenced by the connections one assumes for the waves doing the scattering, for these waves determine the interaction of electric and magnetic fields with the particles.
Accordingly, the general solution of (21), and the matching of conditions across the shock, depend on precisely what one assumes for the fluctuating waves in both the upstream and downstream regions. In a general sense several factors stand out. First, note that (21) involves diffusion in both parallel and perpendicular momentum and also note that the respective diffusion coefficients are, in general, different. Thus one cannot have only pitch angle diffusion without also having energy diffusion. Second, note that the respective diffusion coefficients are dependent on p⊥ and pU,z (or pD,z) and on the bulk velocity on each side of the shock. Thus the four diffusion coefficients are prescribed once the wave spectra entering the definitions of A through D are given. One cannot arbitrarily assume particular functional forms for A through D in order to achieve some preconceived and desired result, because any such choice must be shown to be consistent with the power spectral information that is allowed for individual wave types. Third, note the presence of two sorts of mixed terms in (21) involving cross derivatives with respect to p⊥ and pU,z (or pD,z), with coefficients that differ before and after the partial derivative operators, so that significant differences ensue, particularly in the highly relativistic regime where (1 − Uwz/c2) ≈ 0.
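The smallness of the factor (1 − Uwz/c2) in the highly relativistic regime can be estimated as follows. The sketch assumes a particle moving essentially along the z direction with Lorentz factor γ0 ≫ 1 (so that w⊥ is negligible) and a bulk flow with Lorentz factor Γ ≫ 1; the numerical values are illustrative only.

```latex
\[
\frac{U}{c} = \left(1-\Gamma^{-2}\right)^{1/2} \approx 1 - \frac{1}{2\Gamma^{2}},
\qquad
\frac{w_z}{c} = \left(1-\gamma_0^{-2}\right)^{1/2} \approx 1 - \frac{1}{2\gamma_0^{2}},
\]
\[
1 - \frac{U w_z}{c^{2}} \approx \frac{1}{2\Gamma^{2}} + \frac{1}{2\gamma_0^{2}},
\qquad\text{e.g. } \Gamma = \gamma_0 = 100
\;\Rightarrow\; 1 - \frac{U w_z}{c^{2}} \approx 10^{-4}.
\]
```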
4. Part II—Oblique Ambient Magnetic Field
As noted in Part I, Kirk and Duffy [1] have stated that the de Hoffmann-Teller reference frame is likely not to be of much direct application to the problem of highly relativistic beams because the allowable angles for the upstream magnetic field, as viewed from the rest frame of the shock, are typically O(1/Γ) with respect to the shock front plane. Thus, the probability is that the remaining part of the angular regime ahead of the shock [O(π − 1/Γ)] is the most appropriate situation, for which the ambient magnetic field can always be arranged to be perpendicular to the shock front. Precisely that situation was the focus of Part I; the case where the ambient magnetic field is not so describable forms the subject of this section.
Consider then that the shock makes an angle θ with respect to a background magnetic field, of magnitude BU as measured in the upstream rest frame, as sketched in Figure 2. Then resolve the background magnetic field into a component perpendicular to the shock front, BU,z = BU cos θ, and a component parallel to the shock front (taken without loss of generality to point in the x-direction), BU,⊥ = BU sin θ. In the shock frame S one then has the measured field components BS,⊥ = ΓU BU sin θ and BS,z = BU cos θ. Define an angle ς by tan ς = BS,⊥/BS,z = ΓU tan θ, so that for highly relativistic shocks (ΓU ≫ 1) the angle ς will be almost π/2 unless θ ≪ 1/ΓU, which is the usual division between “parallel” and “perpendicular” regimes. Note also that in the shock frame S there is now a background electric field given through ES,⊥ = ΓU UU × BU/c, provided there is no background electric field in the frame U.
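A short numerical illustration of the relation tan ς = ΓU tan θ shows how quickly the shock-frame field swings toward the shock plane. The value ΓU = 100 and the sample angles θ are assumptions chosen purely for illustration.

```latex
\[
\Gamma_U = 100:\qquad
\begin{array}{lll}
\theta = 0.01\ \mathrm{rad}\ (= 1/\Gamma_U): & \tan\varsigma \approx 1.00, & \varsigma \approx 45^{\circ}, \\
\theta = 1^{\circ}: & \tan\varsigma \approx 1.75, & \varsigma \approx 60^{\circ}, \\
\theta = 10^{\circ}: & \tan\varsigma \approx 17.6, & \varsigma \approx 87^{\circ}.
\end{array}
\]
```

Only for upstream angles θ well below 1/ΓU does the field in the frame S remain close to the shock normal.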

This difference between the situation of a background magnetic field exactly perpendicular to the shock and one making an angle θ (as measured in the frame U) means that, should one wish to obtain a diffusion equation for particle behaviour in the presence of fluctuating electromagnetic fields, then it is simplest to operate in the upstream frame U where only a background magnetic field is present and not a background electric field, and then to transform the final results to the shock frame S. However, the price that must be paid for such a simplification of the equations is the complexity of the boundary conditions that, in the upstream (downstream) frame U(D), are now time-dependent. The reason is that, as seen in the frame U, the shock moves through the upstream plasma at speed UU in the z-direction. The strategic question is, then, whether it is simpler to work in the frame U with complex boundary conditions that are also time-dependent, or whether it is simpler to work in the frame S with more complex equations involving background magnetic and electric fields as measured in S but with static boundary conditions. We have found it simpler to work with static boundary conditions and so for the remainder of this section of the paper the development of the quasilinear equation will be considered in the frame S.
If the upstream components of the fluctuating electric and magnetic fields are known in the frame U as functions of kU and ωU, then one can immediately write down the corresponding components as measured in the frame S. Corresponding relations hold between the frames S and D. In addition one has the invariant d3kUdωU = d3kSdωS.
The formal path to obtaining a quasilinear equation for the time-independent (in the shock frame S) component of the distribution function operates as follows. In the Vlasov equation (but with the addition of the background electric field ES in the shock frame) set fS = FS + δfS where FS is the time independent component and δfS is the time dependent component of the distribution function. Then, quite generally, one has
The canonical assumption made in deriving the quasilinear behaviour of FS is that only the first terms on the right hand side of (24) need to be retained so that
Equation (11) has the solution
Now in the case of a planar shock the spatial dependence of FS can be a function of z only and not of spatial coordinates parallel to the shock front so that (14) then takes the simplified form
One further approximation to the general quasilinear equation is to be concerned only with the component of the quasilinear equation that is independent of the phase angle the particles make around the mean magnetic field measured in the upstream coordinates, so that one singles out just that component from the general (29), which depends on the particle momentum, the pitch angle of the particles, and on the spatial coordinates. This process is technically involved when the upstream magnetic field is other than perpendicular to the shock front. Specifically, the distribution function component of interest can then be a function of only and where
In addition, it has been known since the work of Hudson [2, 3], over forty years ago, that the case of a uniform magnetic field making an angle with respect to the shock front produces extremely different behaviours for the particle distribution function than in the case of an exactly perpendicular magnetic field, even in the absence of electromagnetic fluctuations. The difference arises because, for an oblique magnetic field, the phase angle motion of the particles around the ambient magnetic field is not precisely parallel to the shock front. Particles then intersect the shock front preferentially on one side or the other of the oblique direction, leading to preferential energy and momentum variations, something that does not occur at all for a magnetic field precisely perpendicular to the shock front. So even in the case where electromagnetic fluctuations are ignored, or are absent, the symmetry breaking of the oblique ambient magnetic field produces variations of the transmitted and reflected particles from the shock front that are not recoverable from the situation when one deals with a magnetic field exactly perpendicular to the shock front. Including electromagnetic waves complicates the situation even further, of course.
In short, the distribution function for the particles must have a phase-dependent component that is of fundamental significance unlike the exactly perpendicular magnetic field situation where one can ignore completely any such variation on symmetry grounds (at least at the quasilinear level of approximation) as shown in Part I.
In order to preserve the asymmetric effect of the oblique background magnetic field and, at the same time, to handle the corresponding electromagnetic wave scattering of particles, a procedure that allows for both is the following.
In the frame where the shock is stationary write the operator
The initial/boundary conditions on F0 and G are chosen so that F0 represents the influence of the background fields alone in perturbing the distribution function, while G is chosen to vanish if the electromagnetic fluctuations are set to zero. This separation serves to show the influence of the two sorts of fields on the distribution function and also enables one to handle the basic problem in the shock frame S without the complex evaluation that would otherwise be necessary, as referred to above.
Because one has considered fluctuating electromagnetic field correlations of no higher order than pair-wise in the derivation of the basic quasilinear equation as indicated above, the same assumption is appropriate here so that, to the same order, one can set the exponential factors exp(G) and exp(−G) to unity in the right hand side of (33). One then has
There is no good reason to suppose that the full boundary problem, including electric field fluctuations should be any easier—although future numerical calculations may point the way to a good approximation scheme. Such procedures would also have to incorporate the basic nonperpendicular particle kinetic development given here.
Two tasks remain to carry through in detail. First, one must evaluate the structure of the operator fields 𝒲−1 and ℳ−1 as they act on the fluctuating electromagnetic fields, so that the spectral types of electromagnetic fields need to be specified in both the upstream and downstream regions. Second, one must then apply the boundary conditions at the shock front plane (z = 0) to both the upstream and downstream components and, simultaneously, effect a smoothly continuous transition for the distribution function as one crosses the shock.
5. Part III—Discussion and Conclusion
A complete solution of the distribution function equation in the cases of either a perpendicular (to the shock) ambient magnetic field or an oblique ambient magnetic field across the shock would seem to require a sophisticated mixture of analytical procedures in conjunction with equally sophisticated numerical methods. Indeed, even in the seemingly simple case of a postulated scattering in pitch angle alone in the presence of a perpendicular ambient magnetic field, Kirk and Schneider [5] and Kirk and Duffy (1999) had to resort to left-handed and right-handed eigenfunctions to effect even an approximate solution, and the final approximate analysis then had to be performed numerically. An investigation of this pitch angle scattering situation has been given in considerable generality by Vietri [6] and Blasi and Vietri [4] and represents the most advanced solution of the shock diffusion equation available to date under the pitch angle scattering assumption.
In a more general sense, the following conditions have to be met far from the shock and at the shock. First, the prescription of the dependence of the distribution function far from the shock in both the upstream and downstream regions (in respect of its variability with momentum components parallel and perpendicular to the shock front) is a crucial ingredient, for such a prescription determines the behaviour of the distribution function as one nears the shock. But once this rather general condition is met, further factors need to be satisfied for solution of the diffusion equation across the shock, so that one can relate upstream conditions to downstream conditions. Second, therefore, is the requirement that the distribution function be continuous across the shock, coupled with the requirement that the distribution function be a function only of z, p⊥ and the z component of momentum measured in the upstream (downstream) rest frame in the upstream (downstream) region z > 0 (z < 0). If this second condition is satisfied then the requirement on the flux of particles across the shock is automatically satisfied as well.
Then two problems remain. Third, one must actually effect solutions to the diffusion equation on each side of the shock; these solutions depend on the assignment given for the wave power spectra in the upstream and downstream regions, although once one has specified, say, the upstream wave spectra then the downstream wave spectra are determined, at least in principle. Fourth, one must actually construct a match of the distribution function across the shock at z = 0 from the upstream to downstream regions with the correct continuity of the distribution function. It is the combination of these last two problems that produces considerable technical difficulty, as was already clear from the limited treatment presented in Kirk and Schneider [5] and given most generally by Vietri [6] and Blasi and Vietri [4] in the case of assumed pitch angle scattering only, without connection to possible wave spectra. Even then, considerable difficulty ensued in obtaining a matched, smooth solution for the distribution function across z = 0, the shock plane location.
The major point to be made here is that, prior to investigation of the jump and continuity conditions across the shock, one must specify clearly the appropriate quasilinear distribution function equation that is valid on each side of the shock. Such a description requires that one be specific about the types of electromagnetic disturbances that are taken to exist upstream of the shock, with continuation of the electric and magnetic field components across the shock. In addition, once such a specification is made one must ensure that the evolution of the distribution function is correctly taken into account within the framework of such field distributions. In general, such a specification implies that there will be energy diffusion of the particles as well as pitch angle scattering, unless the magnetic field fluctuations upstream of the shock are carefully enough arranged to discount such energy diffusion. For fast magnetosonic waves, Alfvén waves, and slow magnetosonic waves upstream of the shock there will be energy diffusion. But once one has obtained the appropriate diffusion equation with either perpendicular or oblique ambient magnetic fields (and such a description has been the purpose of the present paper), there remains the formidable problem of addressing the continuity of the distribution function across the shock. There seems to have been little headway achieved with this latter problem as of the present time, although the work of Vietri [6] and Blasi and Vietri [4] provides illustrations of the complexity of the continuity problem even for the simple case of pitch angle scattering only.
The determination of the appropriate diffusion equation for the particle distribution function is, in its own right, complicated enough, as this paper has made clear. The shock continuity problem remains an outstanding challenge.
Acknowledgments
The award of a Mercator Professorship (IL) enabling this work to be undertaken is most gratefully recognized. This work was partially supported (RS) by the Deutsche Forschungsgemeinschaft (DFG) through Grant Schl. 201/19-1.
Appendices
A. The Inverse Operator ℒ−1
From the main text one has
Now the background magnetic field B0 is only in the z direction with strength B0. So introduce the gyrofrequency Ω = eB0/(mcγ), with γ the particle Lorentz factor, and the phase angle φ through px = p⊥ sin φ, py = p⊥ cos φ. It then follows that, after simple Fourier transforms in x and t, one can write
B. Simplification of the Right Hand Side of (17)
The main factor needing simplification on the right hand side of (17) is
In order to expedite evaluation of (B.1) it is most convenient to write the fluctuating fields E and B as Fourier transforms with
In the conventional quasilinear approximation it is also taken that FS may depend on pz and p⊥ but not on the phase angle. Under such conditions, one can then write
One now invokes a stationary statistical nature for the fluctuations in the electric and magnetic fields to write
Then one can write
The various integrals over φ in (B.6) can be readily, if somewhat tediously, performed. The upshot is that the equation describing the evolution of FS takes on the form