Table of Contents

Table of Contents

Class Groups: A Comprehensive Primer

Table of ContentsCloseClose

In 2018, Benedikt Bunz and Ben Fisch, both PhD students at Stanford University, released a paper along with Dan Boneh titled “Batching Techniques for Accumulators with Applications to IOPs and Stateless Blockchains”. In it they use some basic group theory to build a dynamic accumulator, which allows for storing and deleting elements in addition to the use of both membership and non-membership proofs. It can be used to create a vector commitment data structure analogous to Merkle trees, with the main difference being that it allows for constant-sized inclusion proofs, where a Merkle tree has O(logn) sized inclusion proofs where n is the number of elements being stored. If the stored set is big enough, this can be a pretty big deal.

One of the main security assumptions their construction uses is that it relies on a group of unknown order. In particular the strong RSA assumption must hold, meaning it is hard to compute a chosen root of a random group element. There are two good candidates for such a group. The simpler of the two is known as an RSA group, or a multiplicative group of order N where N=pq for some unknown factorization. This however requires a trusted setup, as whoever generates the modulus N must be trusted to discard p and q. The alternative is known as a class group of imaginary quadratic order, which eliminates the need for a trusted setup and which we will be exploring in this post.

Class groups come from a long line of mathematical research, and were originally discovered by Gauss in his Disquisitiones Arithmeticae in 1801. The math that’s been developed on top of his work in the past two centuries involves some decently complex algebra. I’ll explain most of the concepts used but will expect that you know what groups, rings, fields, homomorphisms, and isomorphisms are in an algebraic context. Feel free to look them up if not.

This post is meant to be an introduction to what a class group is and to summarize the most important results to consider when implementing them as a group of unknown order; detailed proofs can be found in the list of further readings below.

What is a class group?

There are two equivalent ways to understand the class group which are isomorphic to one another. One coming from a subfield of mathematics known as algebraic number theory is known as the ideal class group, which is the quotient of fractional ideals by fractional principal ideals JK/PK of a ring of integers OK of a quadratic extension K=Q(d) of the rational numbers. Later we’ll walk through what this means step-by-step. The other way to look at the class group is known as the form class group and comes from the study of binary quadratic forms, or equations of the form

ax2+bxy+cy2=n

by working with an equivalence relation over forms which all have the same discriminant b24ac.

These two views are actually pretty different; it’s not at all obvious that they represent the same object! Either is paramaterized by its integer discriminant Δ, as there is a one-to-one correspondence between class groups and valid discriminants. When using the group for cryptographic purposes as a group of unknown order there are three main choices to be made:

  1. What discriminant should be chosen
  2. How to represent the group in a numerical setting
  3. What algorithms should be used when performing the group operation

The ideal class group makes it easier to understand how and why a particular form of discriminant should be chosen. As we will see, we want the discriminant to be the negation of a prime congruent to 3 mod 4, or p where p3 (mod 4). On the other hand, the form class group is easier to represent and perform operations on numerically. With this in mind, we will start with the ideal class group and move on to the form class group from there.

Ideal Class Group

In basic field theory there is the concept of a field extension, or a field containing another field as a subfield. The degree of an extension is the size of the larger field’s basis over the smaller base field. An example of this is the complex numbers C, which extend the real numbers R by adding in 1=i, so that C=R(i). This is known as a degree 2 or quadratic extension, because the basis for C as a vector space over R is of size 2, and is given by (1,i).

Similarly, we can construct generalizations of the rational numbers by adding in d to Q, for some square-free number d. In this case a basis for Q(d) over Q is also size 2 and is given by (1,d), making Q(d) a quadratic extension of Q. More formally, Q(d) is the smallest field containing both Q and d.

Once we have our quadratic extension K=Q(d), we can also get a corresponding generalization of the integers known as the ring of integers of K, denoted OK. To obtain OK, we look at K and take all of its elements which are the roots of some monic polynomial with integer coefficients, also known as an integral polynomial, i.e. the set of all αK such that there exists some polynomial

p(x)=xn+bn1xn1+...b1x+b0

where bn1,...,b0Z and p(α)=0. It turns out that OQ=Z, so that the ring of integers of the rational numbers is simply the integers. In fact, for any finite-degree extension K of Q, OK contains Z. As its name suggests, OK forms a ring under addition and multiplication.

Something that would help us to understand OK would be to find some analogue of a basis over a vector space for it. We are helped here by the concept of a module, which is a generalization of vector spaces for rings. As an example, a Z -module M consists of the integers Z, which form a ring, acting on some abelian group (M,+). In other words, the ring of integers acts like a set of scalars and the abelian group like a set of vectors. Modules do not have as many guarantees as vector spaces; for example, a module may not have a basis.

Because every ring is an abelian group under addition, every ring is a Z -module. This implies we can act on any ring of integers OK by the integers Z in a way analogous to acting on vectors by scalars. While not every module has a basis as a Z -module (a “Z -basis”), it is true that every ring of integers OK has a Z -basis. More specifically, if K is a degree n extension of Q, then there is some b1,...,bnOK such that any xOK can be uniquely written as

x=i=1naibi

where a1,...,anZ.

What are the possible Z -bases for a ring of integers of a quadratic extension? This turns out to be pretty simple. We know dOK since dK=Q(d) and it is the root of the integral polynomial x2d. It also happens to be the case that if d1 (mod 4) then for any x+ydOK where x,yZ we can write

x+yd=a+b1+d2

for some a,bZ. It can be shown that no other elements are in OK, giving us two possible Z -basis depending on d. If d2,3 (mod 4) then we have a Z -basis of (1,d), and if d1 (mod 4) we have a Z -basis of (1,1+d2). Another way of phrasing this is to say that if d1 (mod 4), then OK=Z[1+d2] and OK=Z[d] otherwise.

To define the discriminant of a finite extension K of Q, we first need the concept of an embedding of K into some field F, which is simply an injective ring homomorphism from K into F. If we have n embeddings σ1,...,σn from K into the complex numbers C and a basis b1,...,bn of OK as a Z -module, then the discriminant of K is given by

ΔK=|σ1(b1)σ1(b2)σ1(bn)σ2(b1)σ2(bn)σn(b1)σn(bn)|2

where the notation above means we take the square of the determinant of the given matrix.

For our case where K=Q(d), we have two embeddings of K into C. Letting a+bdQ(d), one is given by σ1(a+bd)=a+bd and the other by σ2(a+bd)=abd. If d>0 then we can consider these to be embeddings of K into R. In practice we want d<0. As we’ll see later this gives us a nicer structure by letting us take a unique “reduced” form for any element when using the form class group as a numerical representation. It also makes it less likely that the class group will be trivial and contain only one element.

Given these embeddings and the Z -bases above it is easy to check that if d1 (mod 4) then ΔK=d and if d2,3 (mod 4) then ΔK=4d. When choosing a discriminant, this makes it more convenient to pick some square-free value Δ1 (mod 4), as otherwise we need to ensure that Δ/4 is square-free.

We’re almost ready to construct the class group itself! To do so we’ll need the concept of an ideal, which is an additive subgroup of a ring with an additional multiplicative “absorption” property. Specifically if we have an ideal IOK of a ring of integers then for any rR and xI, rxI.

For any nZ, nZ is an ideal.

We can generalize the concept of an ideal of a ring to get the idea of a fractional ideal. Intuitively a fractional ideal J of a ring OK is a subset of the ring’s enclosing field K such that we can “clear the denominators” of J. More formally J is a nonzero subset of K such that for some r0OK, rJOK is an ideal of OK.

As an example 54Z is a fractional ideal of Z, since it is a subset of the smallest field Q containing Z and has the property that 4(54Z)=5Z is an ideal of Z.

For any two ideals or fractional ideals I,JOK we can define a form of multiplication on them:

IJ={aibi|aiI,biJ}

Since ideals are closed under addition, this gives the smallest ideal containing every element of both I and J.

A fractional ideal I of OK is called principal if there is some rOK such that rOK=I. In other words, I is generated by a single element of OK.

Finally we need one more concept from group theory known as a quotient group. While I won’t define it formally here, a basic example is given by Z/3Z which has the structure of arithmetic mod 3. It can be formed by taking every element nZ, computing n3Z, then forming a group operation having the same structure as (Z,+) over the 3 resulting distinct sets of integers.

Let JK denote the set of fractional ideals of OK and PK the set of principal fractional ideals of OK. Both form abelian groups under ideal multiplication defined above. This means it makes sense to take the quotient group JK/PK. This quotient group is the ideal class group.

There are two extreme cases we can consider here. If JK=PK, then JK/PK is the trivial group with one element. On the other hand, if no fractional ideals are principal then JK/PK=JK. The order of JK/PK can be interpreted as a measurement of the extent to which OK fails to be a principal ideal domain, or to have all of its ideals be principal. We can even take this a bit further, since for any ring of integers OK being a principal ideal domain is equivalent to every element of OK having a unique factorization. In other words, the order of a class group of K is a measurement of how much its ring of integers OK fails to give each element a unique factorization! Given that OK is effectively a generalization of the integers, it’s pretty neat that we can define this rigorously.

The order or number of elements of JK/PK is known as its class number, and is known to be hard to compute for large discriminants. As with all cryptographic assumptions, this has not been proven but is assumed to be true because no one has broken it efficiently. Currently, a discriminant of at least 687 bits provides as much security as a 1024-bit RSA key, and a discriminant of at least 1208 bits about as much security as a 2048-bit RSA key. These numbers may change with improved attacks on class group constructions, and should not be taken as advice.

Form Class Group

Next we will discuss the form class group, and see how it is related to the ideal class group. The form class group was originally discovered in the study of binary quadratic forms, or functions of the form

f(x,y)=ax2+bxy+cy2

where a,b,cR. For convenience we will write f=(a,b,c) and call f a form. In practice we want to restrict ourselves to the case where a,b,cZ, as our end goal is to use forms to represent the class group in a computer and being able to store forms as integer triples rather than triples of floating point values simplifies things.

All binary quadratic forms in a given form class group have the same discriminant, given by Δf=b24ac. This is identical to the discriminant of the corresponding ideal class group.

A form represents an integer n if f(x,y)=n for some x,yZ.

The form class group is made up of equivalence classes of forms, or sets of forms considered to be equivalent. This is similar to how, when doing arithmetic mod 3, the symbol 1 represents the set of all integers congruent to 1 (mod 3). We say that two forms f1,f2 are equivalent if they represent the same set of integers.

In order to be a valid group, we need a group operation that will give us some representative f3 of an appropriate equivalence class given forms f1,f2 so that f1f2=f3. We also need an identity form g so that fg=f for all forms f, and for any form f we need an inverse f1 so that ff1=g.

We mentioned above the the ideal class group with discriminant ΔZ is isomorphic to the form class group with the same discriminant Δ. In fact, this is only true when Δ<0, and all forms f=(a,b,c) in the group are positive definite, meaning f(x,y)>0 for all possible x,yZ. This is equivalent to having both Δf<0 and a>0. From now on, we will assume that any form is positive definite.

As the form class group is composed of equivalence classes of forms, it would also be good if we could reduce any form of an equivalence class down to one unique element. Since we also want to represent elements f=(a,b,c) in terms of bits, it would also be good if this reduced element was reasonably small. It turns out that we can define a reduction operation that gives us both of these desirable properties.

We can break down the reduction operation into two pieces; a “normalization” operation and a “reduction” operation, with the requirement that a form must be normalized before it is reduced.

A form f=(a,b,c) is normal if a<ba. We can define a normalization operation by

η(a,b,c):=(a,b+2ra,ar2+br+c)

where r=ab2a.

When we normalize a form f, we want the resulting normalized form f to be equivalent to f. How do we know this is actually the case? It turns out we can understand equivalence of forms using actions by a certain class of matrices known as SL2, or the “special linear group” of invertible matrices with determinant equal to 1.

Two forms f1,f2 are in fact equivalent if there exists α,β,γ,δZ such that

f1(αx+βy,γx+δy)=f2(x,y) αδβγ=1

This is actually only partially true, as we can relax the second requirement to be αδβγ=±1. However, this won’t actually give us a valid equivalence relation, i.e. if we let denote equivalence then f1f2 and f2f3 wouldn’t necessarily imply f1f2f3.

The “correct” form of equivalence mentioned above gives us an action of SL2 on forms, so that if f1,f2 are equivalent then there is some invertible

A=[αβγδ]

with det(A)=1 such that f1=Af2.

With this is mind, we can show that f and its normalized form f=η(f) are equivalent by the matrix

A=[1r01]

Once we’ve normalized a form, we can then reduce it to obtain its unique reduced equivalent form. A form f=(a,b,c) is reduced if it is normal and a<c or a=c and b0.

We can define a reduction operation as

ρ(a,b,c):=(c,b+2sc,cs2bs+a)

where s=c+b2c. Unlike the normalization operation η, it may need multiple iterations before our form is reduced. To reduce a form f, we can compute fη(f) then repeatedly compute fρ(f) until f is reduced.

Similar to normalized forms, we can see that a reduced form f of f is equivalent to f using the matrix

A=[011r]

How do we know reduced elements are relatively small? If Δf<0 then for a reduced form f=(a,b,c) we have that

a|Δf|3

Because in a reduced form |b|a and c=b2Δf4a, the above upper bound on the size of a implies that reduce elements as a whole tend to be small, having a bit representation at least as small as that of Δf. This makes the group operation on reduced elements relatively efficient.

There is also a reasonable upper bound on the number of reduction steps required, given by log2(a|Δ|)+2.

The group operation itself, known as “form composition”, is a bit complicated. The basic idea is that, given two forms f1,f2 with the same discirminant Δ, we can multiply them together and use a change of variables to obtain a third form f3 such that f1f2=f3. More exactly, the game is to find integers p,q,r,s,p,q,r,s,α,β,γ so that

X=px1x2+qx1y2+ry1x2+sy1y2Y=px1x2+qx1y2+ry1x2+sy1y2f3(x,y)=αx2+βxy+γy2f3(X,Y)=f1(x,y)f2(x,y)

Let LinCong(a,b,m) be an algorithm which solves a linear congruence of the form axb (mod m) by finding some x=μ+νn where nZ and outputs (μ,ν). Given two forms f1=(a,b,c) and f2=(α,β,γ), we can define a group operation on them as follows:

  1. Set g12(b+β), h12(bβ), $w ← gcd(a, α, g)$
  2. Set jw, saw, tαw, ugw
  3. Set $(μ, ν) ← LinCong(tu, hu + sc, st)$
  4. Set $(λ, ρ) ← LinCong(tv, h - tμ, s)$
  5. Set kμ+νλ, lkths, mtukhucsst.
  6. Return f3=(st,ju(kt+ls),kljm).

In practice it’s best to always reduce the result f3 after performing composition. This way we are guaranteed that the multiplication of two forms takes O(log2|Δ|) bit operations where Δ is the discriminant of the group being used. This is not guaranteed if the two inputs forms are not reduced.

In order to be a group operation forms under composition must have an identity element. If Δ<0 this turns out to be f=(1,k,k2Δ4) where k=Δ (mod 2).

For any form f=(a,b,c), its inverse under form composition is given by f1=(a,b,c).

We’re now done constructing the form class group! We have a group operation, a way to get a unique representative element from each equivalence class of forms, an identity, and inverses.

There is one very important optimization we can do. Above we mentioned that we want to choose the negation of a prime p as our discriminant. It turns out that this lets us simplify our composition algorithm when f1=f2, since using Δ=p implies that gcd(a,b)=1 for any form f=(a,b,c). Unlike much of the math discussed in this post, this is pretty easy to see. If gcd(a,b)=n1 then for some a,bZ we have that a=na and b=nb, implying Δ=b24ac=n(bb4ac) which is impossible because p can’t be divisible by n.

This, in addition to the fact that a=α, b=β, c=γ gives us the simplified squaring algorithm below:

  1. Set (μ,ν) LinCong(b,c,a)
  2. Return f2=(a2,b2αμ,μ2bμca).

This is a big deal for efficiency, since we can compute exponentiation using repeated squaring.

Isomorphism

Going back to the ideal class group, there is another construction equivalent to the quotient group JK/PK. Similar to equivalence of forms, we can define an equivalence of ideals in JK by saying that two fractional ideals I,J of JK are equivalent if there is some α0K such that αI=J. The equivalence classes formed by this relation are exactly the elements of JK/PK. We can similarly represent fractional ideals by their at most 2 generating elements, so that if I is generated by α,β we can represent it by (α,β). We can also get a unique “reduced” ideal from each equivalence class.

Ideal and form class groups are isomorphic when the discriminant ΔZ being used is negative. In some sense this means multiplication of fractional ideals in the ideal setting and form composition in the form class group are really the same operation, since we can move back and forth between corresponding equivalence classes of fractional ideals and of forms.

We need just a few more tools before defining the isomorphism itself. If K is a finite field extension of F, so that K is a finite-dimensional vector space over F, then for any αK the map mα(x)=αx is an F -linear transformation from K into itself. The field norm N(α) is the determinant of the matrix of this linear transformation. The trace Tr of α is the trace of this matrix.

In our case where K=Q(d), for any x=a+bdK we have N(x)=a2db2, implying N(x) is positive when d<0.

Next, if I is a non-zero fractional ideal of OK, the absolute norm of I is given by the mapping N(I)=|OK/I|, or the order of the quotient of OK by its ideal I.

As their names and notation suggests, the field and absolute norms are related. If an ideal I of ring of integers OK is principal so that there is some αOK such that αOK=I, then N(I)=|N(α)|.

The isomorphism between ideal and form class groups is as follows. If f=(a,b,c) where a,b,cZ and Δf<0, then we can map f to a fractional ideal of the ideal class group with same discriminant by

Φ(a,b,c):=(aZ+b+Δf2Z)

with inverse

Φ1(I):=N(αxβy)N(I)

where (α,β) is some Z -basis of the fractional ideal I. We can see that the inverse maps a fractional ideal to a binary quadratic form using the following identity:

N(αx+βy)=N(α)x2+Tr(αβ)xy+N(β)y2

where for some element of a ring of integers x=a+bd we denote its conjugate by x=abd.

If Δ<0 then the coefficient of x2 given by N(α)/N(I) will be positive, meaning the resulting form f=(a,b,c) will be positive definite as Δf<0 and a>0. It can be shown that when Δ<0, Φ is a bijection which preserves the group structure of binary quadratic forms under form composition in mapping to fractional ideals of a ring of integers under ideal multiplication. In other words, Φ is an isomorphism.

Solved and Open Conjectures

We’ll conclude with a handful of conjectures made in some form by Gauss in 1801 which show further why we want to use a negative discriminant:

  1. The class number of the ideal class group of Q(d) converges to infinity as d.
  2. There are exactly 13 negative discriminants having a class number of 1, in particular -3, -4, -7, -8, -11, -12, -16, -19, -27, -28, -43, -67, -163.
  3. There are infinitely many positive discriminants associated with class groups having a class number of 1.

The first was proven in 1934 by Hans Heilbronn, the second in 1967 by Heegner, Stark and Baker. The third remains open.

Further Reading

Author: Michael Straka

Created: 2024-09-29 Sun 15:32