Recently, we presented a novel Mastrovito form of the nonrecursive Karatsuba multiplier for all trinomials. Specifically, we found that the related Mastrovito matrix is very simple for an equally spaced trinomial (EST) combined with the classic Karatsuba algorithm (KA), which leads to a highly efficient Karatsuba multiplier. In this paper, we consider a new special class of irreducible trinomials, namely, x^m + x^(m/3) + 1. Based on a three-term KA and the shifted polynomial basis (SPB), a novel bit-parallel multiplier is derived with better space and time complexity. As the main contribution, the proposed multiplier costs about 2/3 of the circuit gates of the fastest multipliers, while its time delay matches our former result. To the best of our knowledge, this is the first time that this space complexity bound has been reached without increasing the gate delay.

1. Introduction

Efficient hardware implementation of finite field arithmetic, especially for GF(2^m), is frequently required in coding theory and public-key cryptosystems [1, 2]. Among the arithmetic operations in GF(2^m), multiplication is the most important, as other complicated field operations such as exponentiation and inversion can be carried out by iterated multiplications. Thus, it is necessary to design efficient multipliers.

The field elements are usually represented with respect to a certain basis, such as a polynomial basis (PB), a normal basis (NB), or a dual basis (DB). In PB representation, multiplication consists of multiplying two polynomials and reducing the result modulo an irreducible polynomial. The choice of this irreducible polynomial is critical for performing the reduction efficiently, and irreducible trinomials are among the most common choices [3, 4]. In recent years, many bit-parallel multipliers using PB representation have been proposed for GF(2^m) defined by irreducible trinomials, some of which can be found in [3, 5–8]. The efficiency of an architecture is evaluated by its space and time complexity. The former is expressed as the number of logic gates (XOR and AND), and the latter as the total XOR and AND gate delay along the critical path. Among these multipliers, the fastest bit-parallel multipliers to date were proposed by Fan and Hasan [9] and by Hariri and Reyhani-Masoleh [10]. If GF(2^m) is defined by f(x) = x^m + x^k + 1, 1 < k ≤ m/2, the corresponding multiplier requires m² AND gates and m² − 1 XOR gates with time delay T_A + ⌈log₂(2m − k)⌉T_X (for good fields, the time delay is T_A + ⌈log₂m⌉T_X), where T_A and T_X are the delays of one AND gate and one XOR gate, respectively. Besides these multipliers for general trinomials, there are also several proposals for special types of irreducible trinomials [11–13], which usually exploit the special form of the trinomial to obtain an efficient implementation.

The Karatsuba algorithm (KA) is a typical divide-and-conquer algorithm that works recursively by breaking one large multiplication into two or more submultiplications. The classic KA starts from a way to multiply two 2-term polynomials using three scalar multiplications. Several other variations have also been investigated; more details can be found in [14–16]. The KA can be adopted to design subquadratic-complexity multipliers [14, 17] or hybrid multipliers [18, 19]. In particular, there is another type of hybrid multiplier, namely, the nonrecursive Karatsuba multiplier, which applies the KA only once in the polynomial multiplication [8, 20]. These multipliers typically require about 3/4 of the circuit gates of the fastest bit-parallel multipliers, while their time delay increases by a small number of T_X. For example, the multiplier of Elia et al. [8] costs at least two more T_X.
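As an illustration of applying the classic KA only once, the following minimal Python sketch multiplies two binary polynomials stored as integers (bit i is the coefficient of x^i). The function names `clmul` and `karatsuba_once` are hypothetical and are not taken from the cited works; the sketch models the arithmetic only, not a gate-level circuit.

```python
def clmul(a: int, b: int) -> int:
    """Schoolbook product in GF(2)[x]; bit i of an int is the coefficient of x^i."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # addition in GF(2)[x] is XOR
        a <<= 1
        b >>= 1
    return result


def karatsuba_once(a: int, b: int, n: int) -> int:
    """One nonrecursive Karatsuba step for polynomials of degree < 2n.

    Splits a = a1*x^n + a0 and b = b1*x^n + b0 and uses three half-size
    products instead of four.
    """
    mask = (1 << n) - 1
    a0, a1 = a & mask, a >> n
    b0, b1 = b & mask, b >> n
    low = clmul(a0, b0)
    high = clmul(a1, b1)
    # (a0 + a1)(b0 + b1) + a0*b0 + a1*b1 = a0*b1 + a1*b0 in characteristic 2
    mid = clmul(a0 ^ a1, b0 ^ b1) ^ low ^ high
    return (high << (2 * n)) ^ (mid << n) ^ low
```

The three half-size products are what reduce the gate count to roughly 3/4 of the schoolbook count; the extra additions before and after them are the source of the additional XOR delay mentioned above.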

Recently, we proposed a novel nonrecursive Karatsuba multiplier based on the Mastrovito approach [21]. It was shown that our multiplier requires only one more T_X than the fastest multipliers [9, 10]. However, it costs a few more logic gates than the multiplier of Elia et al. [8]. Besides the nonrecursive Karatsuba multipliers for general trinomials, Shen and Jin [13] proposed a Karatsuba multiplier that fully exploits equally spaced trinomials and the classic KA to simplify the modular reduction. Consequently, the space complexity of their scheme matches the result of Elia et al. Meanwhile, its time complexity is T_A + (1 + ⌈log₂(m − 1)⌉)T_X, which is roughly equal to the fastest results. Furthermore, we observe that the special case m = 2k of our multiplier coincides with their scheme. (Here, the trinomial x^m + x^k + 1 with m = 2k is an equally spaced trinomial.)

In this paper, we explore another special case of our former scheme to obtain even more efficient nonrecursive Karatsuba multipliers. Our main idea is analogous to that of Shen and Jin [13], where a special type of trinomial and a KA variation are used to simplify the structure of the corresponding Mastrovito matrix. More explicitly, we consider the irreducible trinomial x^m + x^(m/3) + 1 and a three-term Karatsuba algorithm. We demonstrate that the corresponding Mastrovito matrix can be simplified further under this condition. The shifted polynomial basis (SPB) [4] is also used to reduce the critical path delay further. Consequently, we propose a bit-parallel multiplier that costs approximately 2/3 of the circuit gates of the fastest bit-parallel multipliers. On the other hand, its time complexity is T_A + ⌈log₂(8m/3)⌉T_X, which almost matches the best known results.

The rest of this paper is organized as follows. In Section 2, we briefly review the Mastrovito approach based on SPB representation and some relevant notions. In Section 3, we introduce a three-term KA formula, investigate the structure of the related Mastrovito matrix, and propose a new bit-parallel multiplier architecture. Section 4 presents a comparison between the proposed multiplier and some others. Finally, some conclusions are drawn in Section 5.

2. Preliminary

In this section, we briefly review some notation and algorithms used throughout this paper. Consider the finite field GF(2^m) generated by an irreducible trinomial x^m + x^(m/3) + 1. Let x be a root of x^m + x^(m/3) + 1, so that the set M = {x^(m−1), x^(m−2), …, x, 1} constitutes a polynomial basis (PB). Therefore, every element of GF(2^m) can be represented as a polynomial over 𝔽₂ of degree less than m. The shifted polynomial basis (SPB) is a variation of the polynomial basis, obtained by multiplying the set M by a certain power of x.

Definition 1 (see [4]). Let v be an integer and let the ordered set M = {x^(m−1), …, x, 1} be a polynomial basis of GF(2^m) over 𝔽₂. The ordered set x^(−v)M := {x^(i−v) | 0 ≤ i ≤ m − 1} is called the shifted polynomial basis with respect to M.

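Generally speaking, the optimal choice of v for an irreducible trinomial is the degree of the middle term or that degree minus one [4]. In our case, we take v = m/3 and use this notation hereafter. It follows that a field element A ∈ GF(2^m) can be expressed with respect to the SPB as follows:

$$A = x^{-v}\sum_{i=0}^{m-1} a_i x^{i}, \qquad a_i \in \mathbb{F}_2. \tag{1}$$

Given two elements of GF(2^m) under SPB representation, that is, A = x^(−v) Σ_{i=0}^{m−1} a_i x^i and B = x^(−v) Σ_{i=0}^{m−1} b_i x^i, the field multiplication can be performed as

$$C = x^{-v}\sum_{i=0}^{m-1} c_i x^{i} = A \cdot B \bmod f(x). \tag{2}$$

Obviously, the product D = AB is thus equal to

$$D = x^{-2v}\Bigl(\sum_{i=0}^{m-1} a_i x^{i}\Bigr)\Bigl(\sum_{j=0}^{m-1} b_j x^{j}\Bigr) = x^{-2v}\sum_{i=0}^{2m-2} d_i x^{i}, \qquad d_i = \sum_{j+l=i} a_j b_l. \tag{3}$$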

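Analogous to ordinary polynomial multiplication, this product can be computed by a matrix-vector multiplication d = A · b, where b and d are the coefficient vectors of B(x) and D(x), and the (2m − 1) × m matrix A is given by

$$\mathbf{A} =
\begin{array}{l}
x^{-2v}\\ x^{-2v+1}\\ \vdots\\ x^{m-2v-1}\\ x^{m-2v}\\ \vdots\\ x^{2m-2v-2}
\end{array}
\begin{bmatrix}
a_0 & 0 & \cdots & 0\\
a_1 & a_0 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
a_{m-1} & a_{m-2} & \cdots & a_0\\
0 & a_{m-1} & \cdots & a_1\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & a_{m-1}
\end{bmatrix}. \tag{4}$$

The only difference between the above matrix and the usual PB case [3] is the labels of the rows on the left side, which indicate the exponent of the indeterminate x for each row.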

We then reduce the above matrix in order to obtain the field product expressed in SPB representation. The reduced matrix, denoted by M, is called the Mastrovito matrix. Thus, the SPB field multiplication is rewritten as

$$\mathbf{c} = \mathbf{M} \cdot \mathbf{b}, \tag{5}$$

where c denotes the coefficient vector of C(x). The structure of M depends on A and the modular reduction rule. In this case, since x^m = x^v + 1 with v = m/3, we obey the following reduction rule:

$$x^{i} =
\begin{cases}
x^{i-m+v} + x^{i-m}, & m - v \le i \le 2m - 2v - 2,\\
x^{i+v} + x^{i+m}, & -2v \le i \le -v - 1.
\end{cases} \tag{6}$$

However, if we directly reduce the product matrix presented in (4) using the above formulae and perform the matrix-vector multiplication, there is no difference between this computation and the general case. In the following section, we construct a new Mastrovito matrix using a three-term Karatsuba algorithm and describe a highly efficient bit-parallel multiplier.
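The direct (non-Karatsuba) construction described above can be sketched in Python as follows. The sketch assumes that rows of the unreduced product matrix are labelled by the degrees −2v, …, 2m − 2v − 2, that column j corresponds to b_j, and that rows outside the SPB range [−v, m − v − 1] are folded back using x^m = x^v + 1. The names `mastrovito_matrix`, `spb_multiply`, `mulmod`, and `to_int` are hypothetical helpers introduced for illustration.

```python
def mastrovito_matrix(a, m):
    """Reduced m x m matrix M for x^m + x^(m/3) + 1 in SPB with v = m/3.

    a is the list of coefficient bits a_0..a_{m-1}; row r of the result
    corresponds to degree r - v.
    """
    v = m // 3
    # Unreduced product matrix: one row per degree label d = -2v .. 2m-2v-2.
    rows = {d: [0] * m for d in range(-2 * v, 2 * m - 2 * v - 1)}
    for i in range(m):
        for j in range(m):
            rows[i + j - 2 * v][j] = a[i]  # a_i * b_j contributes to degree i+j-2v

    def add_row(dst, src):
        rows[dst] = [p ^ q for p, q in zip(rows[dst], rows[src])]

    for d in range(m - v, 2 * m - 2 * v - 1):  # x^d = x^(d-m+v) + x^(d-m)
        add_row(d - m + v, d)
        add_row(d - m, d)
    for d in range(-2 * v, -v):                # x^d = x^(d+v) + x^(d+m)
        add_row(d + v, d)
        add_row(d + m, d)
    return [rows[d] for d in range(-v, m - v)]


def spb_multiply(a, b, m):
    """Coefficient bits c_0..c_{m-1} of C = A*B in SPB, computed as c = M b."""
    M = mastrovito_matrix(a, m)
    return [sum(M[r][j] & b[j] for j in range(m)) & 1 for r in range(m)]


def mulmod(p, q, m):
    """Reference check: p*q mod x^m + x^(m/3) + 1, polynomials as ints, deg < m."""
    f = (1 << m) | (1 << (m // 3)) | 1
    r = 0
    while q:
        if q & 1:
            r ^= p
        q >>= 1
        p <<= 1
        if (p >> m) & 1:
            p ^= f
    return r


def to_int(bits):
    """Pack a list of coefficient bits (index i = coefficient of x^i) into an int."""
    return sum(bit << i for i, bit in enumerate(bits))
```

Since A = x^(−v)·a(x), B = x^(−v)·b(x), and C = x^(−v)·c(x), the result can be checked against ordinary PB arithmetic through the identity x^v · c(x) ≡ a(x) · b(x) mod f(x), for example `mulmod(to_int(c), 1 << v, m) == mulmod(to_int(a), to_int(b), m)`.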

Moreover, one can check that an irreducible trinomial of the form x^m + x^(m/3) + 1 exists when m = 3 × 7ⁱ, where i is a nonnegative integer [1]. Although irreducible trinomials of this type are not abundant, some of them still lie in the range of interest for practical applications.
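For small degrees, the irreducibility claim can be checked numerically. The following Python sketch uses a standard gcd-based test over GF(2): a polynomial f of degree n is irreducible exactly when gcd(x^(2^i) + x, f) = 1 for every i = 1, …, ⌊n/2⌋. Polynomials are stored as integers, and all function names are hypothetical helpers rather than part of any cited work.

```python
def polymod(a: int, f: int) -> int:
    """Remainder of a modulo f in GF(2)[x] (bit i = coefficient of x^i)."""
    df = f.bit_length() - 1
    while a.bit_length() - 1 >= df:
        a ^= f << (a.bit_length() - 1 - df)
    return a


def polygcd(a: int, b: int) -> int:
    """Euclidean algorithm in GF(2)[x]."""
    while b:
        a, b = b, polymod(a, b)
    return a


def sqrmod(a: int, f: int) -> int:
    """Square a modulo f; squaring over GF(2) spreads bit i to bit 2i."""
    s, i = 0, 0
    while a:
        if a & 1:
            s |= 1 << (2 * i)
        a >>= 1
        i += 1
    return polymod(s, f)


def is_irreducible(f: int) -> bool:
    """True iff f has no irreducible factor of degree <= deg(f) / 2."""
    n = f.bit_length() - 1
    x = 0b10
    t = polymod(x, f)
    for _ in range(n // 2):
        t = sqrmod(t, f)             # t = x^(2^i) mod f
        if polygcd(f, t ^ x) != 1:   # common factor with x^(2^i) + x
            return False
    return True


def trinomial(m: int) -> int:
    """The trinomial x^m + x^(m/3) + 1, assuming 3 divides m."""
    return (1 << m) | (1 << (m // 3)) | 1
```

With these helpers, `is_irreducible(trinomial(m))` holds for m = 3, 21, and 147, while for example x⁶ + x² + 1 = (x³ + x + 1)² is rejected.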

Finally, we introduce some notation pertaining to matrices and vectors, which was already used in [21, 23] and is used extensively throughout this paper.

(i) Z(i, :) represents the ith row vector of matrix Z;
(ii) Z(:, j) represents the jth column vector of matrix Z;
(iii) Z(i, j) represents the entry at position (i, j) of matrix Z.

3. Mastrovito Multiplier Using a Three-Term Karatsuba Algorithm

The Karatsuba algorithm [2] has been applied to improve the efficiency of bit-parallel multipliers for GF(2^m) generated by an all-one polynomial (AOP) [20] and by a trinomial [8, 13, 21]. It starts from a way to multiply two two-term polynomials using three scalar multiplications, which reduces the space complexity of the multipliers to approximately 3/4. Besides the classic algorithm, there exist several generalizations of the Karatsuba algorithm [14–16]. Here, we focus only on a simple variation, the three-term Karatsuba algorithm, which multiplies two three-term polynomials using six scalar multiplications. Given two three-term polynomials A = a₂x² + a₁x + a₀ and B = b₂x² + b₁x + b₀ over 𝔽₂, one can check that

$$\begin{aligned}
AB ={}& a_2b_2x^{4} + \bigl[(a_2+a_1)(b_2+b_1) + a_2b_2 + a_1b_1\bigr]x^{3}\\
&+ \bigl[(a_2+a_0)(b_2+b_0) + a_2b_2 + a_1b_1 + a_0b_0\bigr]x^{2}\\
&+ \bigl[(a_1+a_0)(b_1+b_0) + a_1b_1 + a_0b_0\bigr]x + a_0b_0.
\end{aligned} \tag{7}$$
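The six-multiplication identity can be verified mechanically. The Python sketch below, with the hypothetical name `three_term_ka`, computes the five coefficients of the product of two three-term polynomials from exactly six calls to a supplied multiplication routine, assuming that addition is XOR (characteristic 2).

```python
def three_term_ka(a, b, mul):
    """Coefficients (c0, c1, c2, c3, c4) of A*B for A = a2 x^2 + a1 x + a0.

    a = (a0, a1, a2), b = (b0, b1, b2); `mul` is called exactly six times.
    """
    a0, a1, a2 = a
    b0, b1, b2 = b
    p0, p1, p2 = mul(a0, b0), mul(a1, b1), mul(a2, b2)
    q0 = mul(a1 ^ a0, b1 ^ b0)   # (a1 + a0)(b1 + b0)
    q1 = mul(a2 ^ a0, b2 ^ b0)   # (a2 + a0)(b2 + b0)
    q2 = mul(a2 ^ a1, b2 ^ b1)   # (a2 + a1)(b2 + b1)
    return (p0,
            q0 ^ p1 ^ p0,
            q1 ^ p2 ^ p1 ^ p0,
            q2 ^ p2 ^ p1,
            p2)


def schoolbook(a, b, mul):
    """Reference: the same coefficients using all nine products."""
    c = [0] * 5
    for i in range(3):
        for j in range(3):
            c[i + j] ^= mul(a[i], b[j])
    return tuple(c)
```

For single-bit coefficients, `mul` can be the bitwise AND; for the block decomposition used later, the same formula applies with `mul` replaced by a k-bit polynomial product.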

In general, a Mastrovito multiplier that uses the KA has a higher time complexity. Our former result shows that a Mastrovito multiplier using the classic KA costs one more T_X than the fastest ones. However, the work in [13] indicates that this result can be improved further in some special cases, for example, for the EST x^m + x^(m/2) + 1. In the following, we show that for the trinomial x^m + x^(m/3) + 1, applying the three-term Karatsuba-like formula also simplifies the reduction operation and leads to a fast implementation.

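Let f(x) = x^m + x^(m/3) + 1 be an irreducible trinomial and let A = x^(−m/3) Σ_{i=0}^{m−1} a_i x^i and B = x^(−m/3) Σ_{i=0}^{m−1} b_i x^i be two field elements in SPB representation. We partition A and B into three parts, each consisting of m/3 bits. To simplify the related expressions, we denote m/3 by k. Then,

$$A = A_2x^{k} + A_1 + A_0x^{-k}, \qquad B = B_2x^{k} + B_1 + B_0x^{-k}, \tag{8}$$

where A_i = Σ_{j=0}^{k−1} a_(j+ik) x^j and B_i = Σ_{j=0}^{k−1} b_(j+ik) x^j, for i = 0, 1, 2. We then multiply A and B using the three-term Karatsuba-like formula and apply the following transformation:

$$\begin{aligned}
AB ={}& A_2B_2x^{2k} + (C_2D_2 + A_2B_2 + A_1B_1)x^{k} + (C_1D_1 + A_2B_2 + A_1B_1 + A_0B_0)\\
&+ (C_0D_0 + A_1B_1 + A_0B_0)x^{-k} + A_0B_0x^{-2k}\\
={}& (A_2B_2x^{2k} + A_1B_1x^{k} + A_0B_0)(1 + x^{-k} + x^{-2k}) + (C_2D_2x^{k} + C_1D_1 + C_0D_0x^{-k}),
\end{aligned} \tag{9}$$

where C₂ = A₂ + A₁, C₁ = A₂ + A₀, C₀ = A₁ + A₀, D₂ = B₂ + B₁, D₁ = B₂ + B₀, D₀ = B₁ + B₀. We divide (9) into two parts,

$$S_1 = (A_2B_2x^{2k} + A_1B_1x^{k} + A_0B_0)(1 + x^{-k} + x^{-2k}), \qquad S_2 = C_2D_2x^{k} + C_1D_1 + C_0D_0x^{-k}, \tag{10}$$

and compute each part modulo f(x) independently.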
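The split into S₁ and S₂ can be illustrated with a short Python sketch. It assumes A = A₂x^k + A₁ + A₀x^(−k) (and likewise for B), stores every Laurent polynomial as an integer in which bit p stands for degree p − 2k, and takes S₁ as the part built from A₀B₀, A₁B₁, A₂B₂ and their shifts and S₂ = C₂D₂x^k + C₁D₁ + C₀D₀x^(−k). The function names are hypothetical.

```python
def clmul(a: int, b: int) -> int:
    """Schoolbook product in GF(2)[x]; bit i is the coefficient of x^i."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def split_products(a: int, b: int, k: int):
    """Return (S1, S2) with bit p <-> degree p - 2k, using six k-bit products."""
    mask = (1 << k) - 1
    A = [(a >> (i * k)) & mask for i in range(3)]
    B = [(b >> (i * k)) & mask for i in range(3)]
    p = [clmul(A[i], B[i]) for i in range(3)]          # A_i * B_i
    C = [A[1] ^ A[0], A[2] ^ A[0], A[2] ^ A[1]]        # C_0, C_1, C_2
    D = [B[1] ^ B[0], B[2] ^ B[0], B[2] ^ B[1]]        # D_0, D_1, D_2
    q = [clmul(C[i], D[i]) for i in range(3)]          # C_i * D_i
    core = (p[2] << (2 * k)) ^ (p[1] << k) ^ p[0]
    s1 = (core << (2 * k)) ^ (core << k) ^ core        # core * (x^2k + x^k + 1)
    s2 = (q[2] << (3 * k)) ^ (q[1] << (2 * k)) ^ (q[0] << k)
    return s1, s2


def reduce_spb(d: int, m: int) -> int:
    """Fold degrees outside [-k, m-k-1] back using x^m = x^k + 1 (k = m/3).

    Input bit p <-> degree p - 2k; output bit r <-> degree r - k.
    """
    k = m // 3
    c = 0
    for pos in range(2 * m - 1):
        if not (d >> pos) & 1:
            continue
        deg = pos - 2 * k
        if deg >= m - k:
            targets = (deg - m + k, deg - m)
        elif deg < -k:
            targets = (deg + k, deg + m)
        else:
            targets = (deg,)
        for t in targets:
            c ^= 1 << (t + k)
    return c
```

Because the reduction is linear over GF(2), reducing S₁ and S₂ separately and XORing the results gives the same SPB coefficients as reducing the full product, which is what allows the two parts to be handled by independent circuits.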

3.1. Computation of S₁ mod f(x)

We first consider the computation of S₁ in detail. Note that S₁ actually consists of three different parts: A₀B₀, A₁B₁, A₂B₂ (others can be obtained by shift of these parts). When S₁ is rewritten as a matrix-vector form, we have

(11)

For simplicity, we do not write the labels of the product matrix here, which indicate the degree of xⁱ in S₁. Note that these degrees are in the range [−2k, 2m − 2k − 2]. In the above expression, b₀, b₁, b₂ represent the coefficient vectors of B₀, B₁, B₂, respectively. 0_k×k is a k × k zero matrix, A_i,L (i = 0,1, 2) are k × k lower-triangular Toeplitz matrices, and A_i,H (i = 0,1, 2) are k × k upper-triangular Toeplitz matrices. Please note that the matrix on the right side actually contains 6k = 6 · m/3 = 2m rows and the product matrix in fact contains 2m − 1 rows. However, the last row of the above matrix is 0, which does not affect the result. These submatrices have the following form:

(12)

for i = 0, 1, 2. It is easy to check that the product S₁ contains terms whose degrees lie outside the range [−k, m − k − 1], so we have to perform the reduction operation on the product matrix in (11). According to the Mastrovito scheme, the reduction can be regarded as the construction of a reduced matrix from the product matrix A using the reduction rule in (6). Let M_A denote the Mastrovito matrix related to S₁. We now investigate the construction of M_A in detail and obtain the following proposition.

Proposition 2. The Mastrovito matrix M_A can be constructed as

(13)

where

(14)

Proof. The proof is analogous to the proof of Observation 3.1 in [21]. Note that the product matrix A contains 2m − 1 nonzero rows (the last row A(2m, :) is a zero vector), which correspond to the polynomial degrees from −2k to 2m − 2k − 2. It is easy to check that the first k rows and the last m − k − 1 rows correspond to degrees outside the range [−k, m − k − 1]. Thus, we need to reduce these rows.

According to the reduction rule in (6), we reduce the rows {−2k, −2k + 1, …, −k − 1} by adding them to the rows {−k, …, −1} and {m − 2k, …, m − k − 1}, and we reduce the rows {m − k, …, 2m − 2k − 2} by adding them to the rows {0, …, m − k − 2} and {−k, …, m − 2k − 2}. Obviously, the first k rows here are [A_0,L, 0_k×k, 0_k×k], and the last m − k − 1 rows constitute

(15)

Comparing the row labels yields the result immediately.

Based on Proposition 2, we can compute S₁ as follows:

(16)

By swapping and combining some overlapped entries, expression (16) now can be rewritten as

(17)

To obtain S₁, we only need to compute two submatrix-vector multiplications and add the results. Some further tricks can be applied to save more logic gates. We mainly use the computation strategy presented in [7] and fully exploit the overlapped parts of the two matrices above. The computation is divided into two steps:

(i) Perform the row-vector products

(18)

in parallel. The symbol “∗” denotes only the row-vector products related to A_i,L (or A_i,H) and b_i, i = 0, 1, 2, without the final summation. For example, A_0,H∗b₀ represents computing the entrywise products [A_0,H(i, 1) · b₀, …, A_0,H(i, k) · b_(k−1)] for i = 1, 2, …, k in parallel.

(ii) Sum up all the 2m entries of each row using a binary XOR tree. Specifically, since some products in each row are zero, we first compute the summations

(19)

using binary XOR trees and then add these results together.
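The two steps can be modelled for a single row as follows. This Python sketch is only a software model of one AND stage followed by a balanced binary XOR tree; the names `row_products` and `xor_tree` are hypothetical. It illustrates why a row with n product terms costs n − 1 XOR gates and ⌈log₂n⌉ XOR levels.

```python
def row_products(row, b):
    """Step (i): entrywise AND of one matrix row with the vector b (delay T_A)."""
    return [r & x for r, x in zip(row, b)]


def xor_tree(bits):
    """Step (ii): balanced binary XOR tree.

    Returns (sum over GF(2), number of XOR levels on the critical path).
    """
    level = list(bits)
    depth = 0
    while len(level) > 1:
        nxt = [level[i] ^ level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # an odd leftover passes to the next level
        level = nxt
        depth += 1
    return (level[0] if level else 0), depth
```

For example, a row with k = 49 nonzero products, as in GF(2¹⁴⁷), is summed in 6 = ⌈log₂49⌉ levels.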

Remark 3. It is easy to see that the row-vector products in (18) contain all the possible row-vector products in (17). In addition, since A_0,H, A_1,L, A_1,H, and A_2,L are all triangular matrices, one can easily check that each row of both [A_0,H, A_1,L] and [A_1,H, A_2,L] consists of at most k nonzero entries. After the computation of (18) and (19), a certain number of XOR gates are required to obtain the final result. Table 1 summarizes the space and time complexity of S₁ for all the steps.

Table 1. Space and time complexities of S₁ mod f(x).

Operation	#AND	#XOR	Time delay
Inner products in (18)	3k²	-	T_A
Partial addition in (19)	-	3k² − 4k + 1	⌈log₂k⌉T_X
S₁ mod f(x)	-	4k − 1	2T_X

3.2. Computation of S₂ mod f(x)

We now consider the computation of S₂ mod f(x) in detail. Since S₂ = C₂D₂x^k + C₁D₁ + C₀D₀x^(−k) and C_i, D_i (i = 0, 1, 2) consist of k bits, we can follow an approach similar to the computation of S₁ to obtain the result. More explicitly, we rewrite S₂ in matrix-vector form:

(20)

Here, C_i,L (i = 0,1, 2) are k × k lower-triangular Toeplitz matrices and C_i,H (i = 0,1, 2) are k × k upper-triangular Toeplitz matrices, which are constructed from the coefficients of C₀, C₁, C₂ and are similar to A_i,L and A_i,H. Vectors d₀, d₁, d₂ represent the coefficient vectors of D₀, D₁, D₂.

The reduction of S₂ modulo f(x) is simpler: we only need to eliminate the last k rows by adding them to the rows labeled {−k, …, −2} and {0, …, k − 1}. Thus, we have

(21)

Analogously to the computation of S₁ mod f(x), we first perform the row-vector products

(22)

in parallel. Then, we compute the following summations:

(23)

using binary XOR trees first, and then add the related results together. Note that each row of [C_0,H∗d₀, C_1,L∗d₁] and [C_1,H∗d₁, C_2,L∗d₂] contains at most k nonzero entries, so C_0,H · d₀ + C_1,L · d₁ and C_1,H · d₁ + C_2,L · d₂ can be calculated in ⌈log₂k⌉T_X. Finally, all these summations are added to obtain the result, which costs 2k − 2 more XOR gates with one T_X delay. The space and time complexities for the computation of S₂ mod f(x) are summarized in Table 2.

Table 2. Space and time complexities of S₂ mod f(x).

Operation	#AND	#XOR	Time delay
C₀, C₁, C₂	-	3k	T_X
D₀, D₁, D₂	-	3k	T_X
Inner products in (22)	3k²	-	T_A
Partial addition in (23)	-	3k² − 4k + 1	⌈log₂k⌉T_X
S₂ mod f(x)	-	2k − 2	T_X

From Tables 1 and 2, it is clear that the computations of S₁ and S₂ modulo f(x) have the same time delay, so they can be implemented in parallel. Finally, another m XOR gates are needed to add the two results together, which requires one more T_X delay. As a consequence, with k = m/3, the total space and time complexities of the proposed architecture are

$$\begin{aligned}
\#\mathrm{AND} &= 6k^{2} = \frac{2m^{2}}{3},\\
\#\mathrm{XOR} &= 6k^{2} + 7k - 1 = \frac{2m^{2}}{3} + \frac{7m}{3} - 1,\\
\text{Delay} &= T_A + (3 + \lceil \log_2 k \rceil)T_X = T_A + \lceil \log_2 (8m/3) \rceil T_X.
\end{aligned} \tag{24}$$

Furthermore, if m = 2ⁿ + c, where c is positive and smaller than 2ⁿ⁻¹, we have ⌈log₂(8m/3)⌉ = 1 + ⌈log₂m⌉. In this case, the time delay of our architecture becomes T_A + (1 + ⌈log₂m⌉)T_X, which is almost equal to the delay of the fastest bit-parallel multipliers [9].
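The totals follow by adding up the entries of Tables 1 and 2 and the final m XOR gates. The Python sketch below performs this bookkeeping; the function name `complexity` and the dictionary keys are hypothetical, and the delay is counted in units of T_X on top of one T_A.

```python
def complexity(m: int) -> dict:
    """Gate counts and XOR-level delay for x^m + x^(m/3) + 1, from Tables 1 and 2."""
    k = m // 3
    log_k = (k - 1).bit_length()                 # ceil(log2(k)) for k >= 1
    # Table 1: S1 mod f(x)
    and_s1 = 3 * k * k
    xor_s1 = (3 * k * k - 4 * k + 1) + (4 * k - 1)
    delay_s1 = log_k + 2
    # Table 2: S2 mod f(x), including the pre-additions C_i and D_i
    and_s2 = 3 * k * k
    xor_s2 = 3 * k + 3 * k + (3 * k * k - 4 * k + 1) + (2 * k - 2)
    delay_s2 = 1 + log_k + 1
    return {
        "AND": and_s1 + and_s2,
        "XOR": xor_s1 + xor_s2 + m,              # m more XORs add S1 and S2
        "T_X": max(delay_s1, delay_s2) + 1,      # one more level for S1 + S2
    }
```

For m = 147 this gives 14406 AND gates, 14748 XOR gates, and a delay of T_A + 9T_X, in agreement with the entry for this paper in Table 4.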

4. Theoretic Comparison

Table 3 gives a comparison of different implementation methods of bit-parallel multipliers in the fields generated by trinomials x^m + x^(m/3) + 1. From Table 3, we can see that our multiplier requires about 2/3 of the circuit gates of the previous architectures that do not use a divide-and-conquer algorithm. On the other hand, the time complexity of the proposed multiplier is T_A + ⌈log₂(8m/3)⌉T_X, which is very close to the fastest result. In fact, we have checked this type of trinomial with degree m = 3 · 7ⁱ, i = 1, 2, …, 1000, and found that 585 of these trinomials reach the bound T_A + (1 + ⌈log₂m⌉)T_X (the others require only one more T_X).
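The count over m = 3 · 7ⁱ can be recomputed with exact integer arithmetic, as in the following Python sketch. The names `ceil_log2`, `reaches_bound`, and `count_reaching` are hypothetical, and the sketch only tests the condition ⌈log₂(8m/3)⌉ = 1 + ⌈log₂m⌉; it does not assert the figure reported above.

```python
def ceil_log2(n: int) -> int:
    """Exact ceil(log2(n)) for a positive integer n."""
    return (n - 1).bit_length()


def reaches_bound(i: int) -> bool:
    """True iff m = 3 * 7**i satisfies ceil(log2(8m/3)) == 1 + ceil(log2(m))."""
    m = 3 * 7 ** i
    return ceil_log2(8 * 7 ** i) == 1 + ceil_log2(m)   # 8m/3 = 8 * 7**i exactly


def count_reaching(limit: int = 1000) -> int:
    """Number of i in 1..limit whose delay meets T_A + (1 + ceil(log2 m)) T_X."""
    return sum(reaches_bound(i) for i in range(1, limit + 1))
```

For instance, m = 21 and m = 147 reach the bound, whereas m = 3 · 7⁴ = 7203 needs one more T_X.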

Table 3. Comparison of bit-parallel multipliers for GF(2^m) generated by x^m + x^(m/3) + 1.

Multiplier	#AND	#XOR	Time delay
Sunar and Koç [3]	m²	m² − 1	T_A + (2 + ⌈log₂m⌉)T_X
Wu [5]	m²	m² − 1	T_A + (2 + ⌈log₂m⌉)T_X
Wu [6]	m²	m² − 1	T_A + (2 + ⌈log₂m⌉)T_X
Fan and Dai [4]	m²	m² − 1
Elia et al. [8]			T_A + (3 + ⌈log₂m⌉)T_X
Negre [7]	m²
Fan [22] Type-A
Fan [22] Type-B			T_A + ⌈log₂(2m − 1)⌉T_X
Li et al. [21]
This paper

Note. 2^(v−1) < m/3 ≤ 2^v, and W(∗) denotes the Hamming weight of its argument.

In Table 4, we give a small example for the field GF(2¹⁴⁷) defined by x¹⁴⁷ + x⁴⁹ + 1. It shows that, compared with other approaches, our architecture may be the best choice if space and time complexity are both considered. In addition, the comparison with the fastest Karatsuba multiplier for general trinomials [21] indicates that the space and time complexities can be reduced even further if a special KA and a special irreducible polynomial are combined.

Table 4. Complexity for practical field GF(2¹⁴⁷).

Basis	#AND	#XOR	Time
PB [3, 5]	21609	21608	T_A + 10T_X
PB [8]	16280	16838	T_A + 11T_X
SPB [4]	21609	21608	T_A + 9T_X
SPB [9]	21609	21608	T_A + 8T_X
SPB [7]	21609	27391	T_A + 8T_X
PB-CRT Type A [22]	21560	21560	T_A + 9T_X
PB-CRT Type B [22]	21560	21658	T_A + 9T_X
SPB [21]	16280	17394	T_A + 9T_X
SPB (this paper)	14406	14748	T_A + 9T_X

5. Conclusion

In this paper, a new Mastrovito multiplier architecture for trinomials of the form x^m + x^(m/3) + 1 is proposed. We show that the space and time complexity of our former Mastrovito-Karatsuba multiplier can be further reduced when a special form of trinomial is combined with a KA variation. This multiplier can be used in area-critical applications because it has low space complexity while maintaining a relatively low time delay. Finding more polynomials to which the proposed strategy applies is left as future work.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the Natural Science Foundation of China (nos. 61402393, 61601396) and Shanghai Key Laboratory of Integrated Administration Technologies for Information Security (no. AGK201607).

References

1 Lidl R. and Niederreiter H., Finite Fields, 1996, Cambridge University Press, New York, NY, USA, MR1429394, https://doi.org/10.1017/CBO9780511525926.
2 Knuth D. E., Art of Computer Programming, Volume 2: Seminumerical Algorithms, 1997, 3rd edition, Addison-Wesley Professional, MR0286318.
3 Sunar B. and Koç Ç. K., Mastrovito multiplier for all trinomials, IEEE Transactions on Computers. (1999) 48, no. 5, 522–527, 2-s2.0-0032627015, https://doi.org/10.1109/12.769434, Zbl1231.68043.
4 Fan H. and Dai Y., Fast bit-parallel GF(2ⁿ) multiplier for all trinomials, IEEE Transactions on Computers. (2005) 54, no. 4, 485–490, https://doi.org/10.1109/TC.2005.64, 2-s2.0-17644388075.
5 Wu H., Bit-parallel finite field multiplier and squarer using polynomial basis, IEEE Transactions on Computers. (2002) 51, no. 7, 750–758, 2-s2.0-0036647149, https://doi.org/10.1109/TC.2002.1017695.
6 Wu H., Montgomery multiplier and squarer for a class of finite fields, IEEE Transactions on Computers. (2002) 51, no. 5, 521–529, 2-s2.0-0036567372, https://doi.org/10.1109/TC.2002.1004591.
7 Negre C., Efficient parallel multiplier in shifted polynomial basis, Journal of Systems Architecture. (2007) 53, no. 2-3, 109–116, 2-s2.0-33846263056, https://doi.org/10.1016/j.sysarc.2006.09.004.
8 Elia M., Leone M., and Visentin C., Low complexity bit-parallel multipliers for GF(2^m) with generator polynomial x^m + x^k + 1, IEEE Electronics Letters. (1999) 35, no. 7, 551–552, https://doi.org/10.1049/el:19990407, 2-s2.0-0032689933.
9 Fan H. and Hasan M. A., Fast bit parallel-shifted polynomial basis multipliers in GF(2ⁿ), IEEE Transactions on Circuits and Systems I: Regular Papers. (2006) 53, no. 12, 2606–2615, https://doi.org/10.1109/TCSI.2006.883855, MR2370451.
10 Hariri A. and Reyhani-Masoleh A., Bit-serial and bit-parallel Montgomery multiplication and squaring over GF(2^m), IEEE Transactions on Computers. (2009) 58, no. 10, 1332–1345, https://doi.org/10.1109/TC.2009.70, 2-s2.0-70349592322.
11 Choi Y. J., Chang K.-Y., Hong D. W., and Cho H. S., Hybrid multiplier for GF(2^m) defined by some irreducible trinomials, IEEE Electronics Letters. (2004) 40, no. 14, 852–853, https://doi.org/10.1049/el:20040584, 2-s2.0-3142748230.
12 Lee C.-Y., Low-latency bit-parallel systolic multiplier for irreducible x^m + xⁿ + 1 with GCD(m, n) = 1, IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences. (2003) E86A, no. 11, 2844–2852, 2-s2.0-0345376118.
13 Shen H. and Jin Y., Low complexity bit parallel multiplier for GF(2^m) generated by equally-spaced trinomials, Information Processing Letters. (2008) 107, no. 6, 211–215, https://doi.org/10.1016/j.ipl.2008.01.012, MR2438028.
14 Weimerskirch A. and Paar C., Generalizations of the Karatsuba Algorithm for Efficient Implementations, 2003, available at http://www.crypto.ruhr-uni-bochum.de/imperia/md/content/texte/kaweb.pdf.
15 Montgomery P. L., Five, six, and seven-term Karatsuba-like formulae, IEEE Transactions on Computers. (2005) 54, no. 3, 362–369, 2-s2.0-14844351609, https://doi.org/10.1109/TC.2005.49.
16 Fan H., Gu M., Sun J., and Lam K.-Y., Obtaining more Karatsuba-like formulae over the binary field, IET Information Security. (2012) 6, no. 1, 14–19, https://doi.org/10.1049/iet-ifs.2010.0114, 2-s2.0-84863374298.
17 Fan H., Sun J., Gu M., and Lam K.-Y., Overlap-free Karatsuba-Ofman polynomial multiplication algorithms, IET Information Security. (2010) 4, no. 1, 8–14, 2-s2.0-75949124148, https://doi.org/10.1049/iet-ifs.2009.0039.
18 Rodríguez-Henríquez F. and Koç Ç. K., On fully parallel Karatsuba multipliers for GF(2^m), Proceedings of the International Conference on Computer Science and Technology (CST '03), 2003, ATA Press, 405–410.
19 Von Zur Gathen J. and Shokrollahi J., Efficient FPGA-based Karatsuba multipliers for polynomials over 𝔽₂, 3897, Proceedings of the 12th Workshop on Selected Areas in Cryptography (SAC '05), Springer, 359–359, MR2241649.
20 Chang K.-Y., Hong D., and Cho H.-S., Low complexity bit-parallel multiplier for GF(2^m) defined by all-one polynomials using redundant representation, IEEE Transactions on Computers. (2005) 54, no. 12, 1628–1630, https://doi.org/10.1109/TC.2005.199, 2-s2.0-30344442131.
21 Li Y., Ma X., Zhang Y., and Qi C., Mastrovito form of non-recursive Karatsuba multiplier for all trinomials, IEEE Transactions on Computers. (2017) 66, no. 9, 1573–1584, https://doi.org/10.1109/TC.2017.2677913, 2-s2.0-85029480390.
22 Fan H., A Chinese remainder theorem approach to bit-parallel GF(2ⁿ) polynomial basis multipliers for irreducible trinomials, IEEE Transactions on Computers. (2016) 65, no. 2, 343–352, https://doi.org/10.1109/TC.2015.2428704, 2-s2.0-84962076639.
23 Zhang T. and Parhi K. K., Systematic design of original and modified Mastrovito multipliers for general irreducible polynomials, IEEE Transactions on Computers. (2001) 50, no. 7, 734–749, 2-s2.0-0035392553, https://doi.org/10.1109/12.936239.
