Linear algebra over p-adics
Assumptions on the base ring
Throughout this chapter, we continue using some notation and assumptions introduced in Chapter 1: R is our base ring (typically, Z or k[X]), p is a non-zero element of R (typically, a prime in Z or X ∈ k[X]) and Rp is the completion of R for the p-adic topology (so we get, for instance, the p-adic integers or the power series ring k[[X]]). In order to simplify some considerations below regarding the notion of rank of a matrix over a ring, we make the following assumption throughout this chapter: both R and Rp are domains; this is the case in the examples above. As before, we fix a set M of representatives of R/(p), which allows us to define the length λ(a) of a non-zero p-adic a ∈ Rp; recall that we assume that the elements of R ⊂ Rp have finite length. We extend the length function to vectors and matrices of p-adics by setting λ(A) ≔ max {λ(Ai,j) : 1 ≤ i ≤ r, 1 ≤ j ≤ s} for A ∈ Mr×s(Rp).
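For concreteness, the following Python sketch computes λ over R = Z for a fixed prime p (the base-M expansion then being the usual base-p expansion of the absolute value); the function names padic_length and matrix_length are ours, chosen only for this illustration.

```python
# A minimal sketch of the length function lambda, assuming R = Z and a fixed
# prime p; the helper names below are illustrative only.

def padic_length(a, p):
    """Number of digits of a non-zero a in its base-p expansion
    (the length lambda(a) for elements of R = Z viewed inside Z_p)."""
    if a == 0:
        raise ValueError("lambda is only defined for non-zero elements")
    a, n = abs(a), 0
    while a != 0:
        a //= p
        n += 1
    return n

def matrix_length(A, p):
    """lambda(A) = max over entries of lambda(A[i][j]) for a matrix A
    with non-zero integer entries."""
    return max(padic_length(a, p) for row in A for a in row)

# Example: with p = 5, lambda(7) = 2 since 7 = 2 + 1*5.
assert padic_length(7, 5) == 2
assert matrix_length([[7, 1], [126, 3]], 5) == 4   # 126 needs 4 base-5 digits
```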
Problem statement
We consider a linear system of the form A = B · C, where A and B are known and C is the unknown. The matrix A belongs to Mr×s(Rp) and B ∈ Mr×r(Rp) is invertible; we solve the linear system A = B · C for C ∈ Mr×s(Rp). We make the natural assumption that s ≤ r; the most interesting cases are s = 1 (which amounts to linear system solving) and s = r, which contains in particular the problem of inverting B (our algorithm handles both cases in a uniform manner). A major application of p-adic linear system solving is actually to solve systems over R (in the two contexts above, this means systems with integer, resp. polynomial, coefficients) by means of lifting techniques (the paper [MC79] introduced this idea in the case of integer linear systems). In such cases, the solution C belongs to Mr×s(Q), where Q is the fraction field of R, with a denominator invertible modulo p. Using p-adic techniques, we can compute the expansion of C in Mr×s(Rp), from which C itself can be reconstructed by means of rational reconstruction; we will focus on the lifting step, and we will not detail the reconstruction step here.

In order to describe such situations quantitatively, we will use the following parameters: the length of the entries of A and B, that is, d ≔ max(λ(A), λ(B)), and the precision N to which we require C; thus, we will always be able to suppose that d ≤ N. The case N = d corresponds to the resolution of p-adic linear systems proper, whereas solving systems over R often requires taking a precision N ≫ d. Indeed, in that case, we deduce from Cramer's formulas that the numerators and denominators of C have length O(r (d + log(r))), so that we take N of order O(r (d + log(r))) in order to make rational reconstruction possible. For computations with structured matrices, we will use a different, non-trivial representation of B, by means of its “generators”; we will then denote by d′ the length of these generators. Details are given below.
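Although the reconstruction step is not detailed in this chapter, the following Python sketch recalls, for R = Z and m = p^N, the standard rational reconstruction procedure based on the extended Euclidean algorithm; the bound ⌊√(m/2)⌋ and the function name are the usual textbook choices, not taken from the text.

```python
# A hedged sketch of rational reconstruction over R = Z: given the expansion c
# of a solution entry modulo m = p^N, recover a fraction n/d with denominator d
# invertible modulo p. Requires Python 3.8+ (math.isqrt, pow with exponent -1).
from math import gcd, isqrt

def rational_reconstruction(c, m):
    """Find (n, d) with c*d = n (mod m), |n| <= B and 0 < d <= B for
    B = isqrt(m // 2), via the extended Euclidean algorithm; None if none."""
    bound = isqrt(m // 2)
    r0, r1 = m, c % m
    t0, t1 = 0, 1
    while r1 > bound:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if t1 == 0 or abs(t1) > bound or gcd(r1, abs(t1)) != 1:
        return None
    return (r1, t1) if t1 > 0 else (-r1, -t1)

# Example: reconstruct 2/3 from its expansion modulo 5^6.
m = 5 ** 6
c = (2 * pow(3, -1, m)) % m
assert rational_reconstruction(c, m) == (2, 3)
```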
Complexity model
Throughout this chapter, we represent all p-adics through their base-M expansion, and we measure the cost of an algorithm by the number of arithmetic operations on p-adics of length 1 (i.e. with only a constant coefficient) it performs, as explained in Chapter 1. The algorithms in this chapter will rely on the notion of shifted decomposition: a shifted decomposition of a p-adic a ∈ Rp is simply a pair (σa, δa) ∈ Rp × Rp such that a = σa + p δa. A simple particular case is (a mod p, a quo p); this is by no means the only choice. This notion carries over to matrices without difficulty.

We denote by I(N) the cost of multiplying two p-adics at precision N, and by R(N) the cost of multiplying two p-adics at precision N by an on-line algorithm. As in Chapter 1, we further let M(d) denote the arithmetic complexity of multiplying polynomials of degree at most d over any ring (we will need this operation for the multiplication of structured matrices). Remark that when R = k[X], I and M coincide, but this is no longer necessarily the case over other rings, such as Z.

Next, let I(r, d) be the cost of multiplying two polynomials in Rp[Y] of degree at most r with coefficients of length at most d. Since the coefficients of the product polynomial have length at most 2 d + ⌈log2(r)⌉, we deduce that we can take I(r, d) = O(M(r) I(d + log(r))) by working modulo p raised to the required precision; over Rp = k[[X]], the log(r) term vanishes since no carry occurs. Let us focus on the corresponding on-line algorithm. We consider these polynomials as p-adics of polynomials, i.e. p-adics whose coefficients are polynomials with coefficients in M. We denote by R(r, N) the cost of an on-line multiplication at precision N of polynomials of degree at most r. As in Chapter 1, this cost is bounded by R(r, N) = O(I(r, N) log(N)) in the case of power series rings or p-adic integers. If the length d′ of the coefficients of one operand is less than N, the cost reduces to O(N R(r, d′)/d′).

Now, let us turn to matrix arithmetic. We let ω be such that we can multiply r × r matrices within O(r^ω) ring operations over any ring. The best known bound is ω ≤ 2.3727 [CW90, Sto10, VW11]. It is known that, if the base ring is a field, we can invert any invertible matrix using O(r^ω) base field operations. We will further denote by MM(r, s, d) the cost of multiplying matrices A, B of sizes r × r and r × s over Rp, for inputs of length at most d. In our case s ≤ r and, taking into account the growth of the length in the output, we obtain that MM(r, s, d) satisfies MM(r, s, d) = O(r^2 s^(ω−2) I(d + log(r))), since λ(A · B) ≤ 2 d + ⌈log2(r)⌉; the exponents on r and s are obtained by partitioning A and B into square blocks of size s.

Let us now consider the relaxed product of p-adic matrices, i.e. p-adics whose coefficients are matrices over M. We denote by MMR(r, s, N) the cost of the relaxed multiplication of a p-adic matrix of size r × r by a p-adic matrix of size r × s at precision N. As in Chapter 1, we can connect the costs of off-line and on-line multiplication algorithms by MMR(r, s, N) = O(MM(r, s, N) log(N)) in the case of power series rings or p-adic integers. Likewise, we also notice that the relaxed multiplication of two such p-adic matrices A and B at precision N, with d ≔ λ(A) ≤ N, takes time O(N MMR(r, s, d)/d).

Previous work

The first algorithm we will mention is due to Dixon [Dix82]; it finds one p-adic coefficient of the solution C at a time and then updates the matrix A.
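For concreteness, here is a Python sketch of this iteration for R = Z and s = 1: each step extracts one base-p digit vector of C using B^{-1} mod p, then updates the right-hand side and divides it by p. The helper names are ours, and the sketch only illustrates the principle, not the precise variant analyzed in this chapter.

```python
# A minimal sketch of Dixon-style lifting [Dix82] for s = 1 over R = Z, assuming
# B is invertible modulo p; it produces the first N p-adic coefficients of the
# solution C of A = B*C, one digit vector at a time. Plain lists of ints are
# used instead of a matrix library; all names are illustrative.

def mat_vec(B, v, mod=None):
    w = [sum(B[i][j] * v[j] for j in range(len(v))) for i in range(len(B))]
    return [x % mod for x in w] if mod else w

def dixon_solve(B, a, p, N, Binv_mod_p):
    """Return c with B*c = a (mod p^N), given Binv_mod_p = B^{-1} mod p."""
    r = list(a)                  # residual: a = B*c + p^i * r after i steps
    c = [0] * len(a)
    pk = 1                       # current power p^i
    for _ in range(N):
        ci = mat_vec(Binv_mod_p, [x % p for x in r], mod=p)   # next digit vector
        c = [cj + pk * cij for cj, cij in zip(c, ci)]
        Bci = mat_vec(B, ci)
        r = [(rj - bj) // p for rj, bj in zip(r, Bci)]        # exact division by p
        pk *= p
    return [cj % pk for cj in c]

# Example with p = 7: B^{-1} mod 7 was computed by hand.
B, a, p, N = [[2, 1], [1, 3]], [5, 4], 7, 4
c = dixon_solve(B, a, p, N, [[2, 4], [4, 6]])
assert mat_vec(B, c, mod=p ** N) == [x % p ** N for x in a]
```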
On the other side of the spectrum, one finds Newton's iteration, which doubles the precision of the solution at each step (and can thus benefit from fast p-adic multiplication); however, this algorithm computes the whole inverse of B at precision N, which can be too costly when we only want one vector solution. The Moenck-Carter algorithm [MC79] is a variant of Dixon's algorithm that works with p^ℓ-adics instead of p-adics. It takes advantage of fast truncated p-adic multiplication but requires computing the inverse of B at precision d (for which Newton's iteration is used). Finally, Storjohann's high-order lifting algorithm [Sto03] can be seen as a fast version of the Moenck-Carter algorithm, well suited to cases where d ≪ N. That algorithm was presented for R = k[X], and the result was extended to the integer case in [Sto05]. We believe that the result could carry over to any p-adic ring.
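As an illustration of the Newton approach, the following Python sketch lifts the inverse of B for R = Z by the classical iteration X ← X (2 I − B X), which doubles the precision at every step; note that it computes the whole inverse at precision N, as pointed out above. The helper names are ours, and this sketch does not reflect the high-order lifting of [Sto03].

```python
# A hedged sketch of Newton iteration for lifting B^{-1} over R = Z, assuming an
# initial inverse X0 = B^{-1} mod p; ceil(log2(N)) steps reach precision N.
# Helper names are illustrative only.

def mat_mul(A, B, mod):
    n, m, s = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) % mod for j in range(s)]
            for i in range(n)]

def newton_inverse(B, X0, p, N):
    """Return X with B*X = I (mod p^N), starting from X0 = B^{-1} mod p."""
    X, k = X0, 1
    while k < N:
        k = min(2 * k, N)
        mod = p ** k
        BX = mat_mul(B, X, mod)
        # X <- X * (2I - B*X) mod p^k
        E = [[(-BX[i][j]) % mod for j in range(len(BX))] for i in range(len(BX))]
        for i in range(len(E)):
            E[i][i] = (E[i][i] + 2) % mod
        X = mat_mul(X, E, mod)
    return X

# Example reusing B = [[2,1],[1,3]] and its inverse [[2,4],[4,6]] mod 7.
B, p, N = [[2, 1], [1, 3]], 7, 8
X = newton_inverse(B, [[2, 4], [4, 6]], p, N)
assert mat_mul(B, X, p ** N) == [[1, 0], [0, 1]]
```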