All remainders are modulo-sixty remainders (divide the number by 60 and return the remainder).
All numbers, including x and y, are positive integers.
Flipping an entry in the sieve list means to change the marking (prime or nonprime) to the opposite marking.
This results in numbers with an odd number of solutions to the corresponding equation being potentially prime (prime if they are also square free), and numbers with an even number of solutions being composite.
The algorithm:
Create a results list, filled with 2, 3, and 5.
Create a sieve list with an entry for each positive integer; all entries of this list should initially be marked non prime (composite).
For each entry number n in the sieve list, with modulo-sixty remainder r :
If r is 1, 13, 17, 29, 37, 41, 49, or 53, flip the entry for each possible solution to 4x2 + y2 = n. The number of flipping operations as a ratio to the sieving range for this step approaches 4√π/15[1] × 8/60 (the "8" in the fraction comes from the eight modulos handled by this quadratic and the 60 because Atkin calculated this based on an even number of modulo 60 wheels), which results in a fraction of about 0.1117010721276....
If r is 7, 19, 31, or 43, flip the entry for each possible solution to 3x2 + y2 = n. The number of flipping operations as a ratio to the sieving range for this step approaches π√0.12[1] × 4/60 (the "4" in the fraction comes from the four modulos handled by this quadratic and the 60 because Atkin calculated this based on an even number of modulo 60 wheels), which results in a fraction of about 0.072551974569....
If r is 11, 23, 47, or 59, flip the entry for each possible solution to 3x2 − y2 = n when x > y. The number of flipping operations as a ratio to the sieving range for this step approaches √1.92ln(√0.5+√1.5)[1] × 4/60 (the "4" in the fraction comes from the four modulos handled by this quadratic and the 60 because Atkin calculated this based on an even number of modulo 60 wheels), which results in a fraction of about 0.060827679704....
If r is something else, ignore it completely.
Start with the lowest number in the sieve list.
Take the next number in the sieve list still marked prime.
Include the number in the results list.
Square the number and mark all multiples of that square as non prime. Note that the multiples that can be factored by 2, 3, or 5 need not be marked, as these will be ignored in the final enumeration of primes.
Repeat steps 4 through 7. The total number of operations for these repetitions of marking the squares of primes as a ratio of the sieving range is the sum of the inverse of the primes squared, which approaches the prime zeta function at 2 (equal to 0.45224752004...) minus 1/22 , 1/32, and 1/52 for those primes which have been eliminated by the wheel, with the result multiplied by 16/60 for the ratio of wheel hits per range; this results in a ratio of about 0.01363637571....
Adding the above ratios of operations together, the above algorithm takes a constant ratio of flipping/marking operations to the sieving range of about 0.2587171021...; From an actual implementation of the algorithm, the ratio is about 0.25 for sieving ranges as low as 67.
Pseudocode
The following is pseudocode which combines Atkin's algorithms 3.1, 3.2, and 3.3[1] by using a combined set s of all the numbers modulo 60 excluding those which are multiples of the prime numbers 2, 3, and 5, as per the algorithms, for a straightforward version of the algorithm that supports optional bit-packing of the wheel; although not specifically mentioned in the referenced paper, this pseudocode eliminates some obvious combinations of odd/even xs and ys in order to reduce computation where those computations would never pass the modulo tests anyway (i.e. would produce even numbers, or multiples of 3 or 5):
limit←1000000000// arbitrary search limit// set of wheel "hit" positions for a 2/3/5 wheel rolled twice as per the Atkin algorithms←{1,7,11,13,17,19,23,29,31,37,41,43,47,49,53,59}// Initialize the sieve with enough wheels to include limit:forn←60×w+xwherew∈{0,1,...,limit÷60},x∈s:is_prime(n)←false// Put in candidate primes:// integers which have an odd number of// representations by certain quadratic forms.// Algorithm step 3.1:forn≤limit,n←4x²+y²wherex∈{1,2,...}andy∈{1,3,...}// all x's odd y'sifnmod60∈{1,13,17,29,37,41,49,53}:is_prime(n)←¬is_prime(n)// toggle state// Algorithm step 3.2:forn≤limit,n←3x²+y²wherex∈{1,3,...}andy∈{2,4,...}// only odd x'sifnmod60∈{7,19,31,43}:// and even y'sis_prime(n)←¬is_prime(n)// toggle state// Algorithm step 3.3:forn≤limit,n←3x²-y²wherex∈{2,3,...}andy∈{x-1,x-3,...,1}//all even/oddifnmod60∈{11,23,47,59}:// odd/even combosis_prime(n)←¬is_prime(n)// toggle state// Eliminate composites by sieving, only for those occurrences on the wheel:forn²≤limit,n←60×w+xwherew∈{0,1,...},x∈s,n≥7:ifis_prime(n):// n is prime, omit multiples of its square; this is sufficient // because square-free composites can't get on this listforc≤limit,c←n²×(60×w+x)wherew∈{0,1,...},x∈s:is_prime(c)←false// one sweep to produce a sequential list of primes up to limit:output2,3,5for7≤n≤limit,n←60×w+xwherew∈{0,1,...},x∈s:ifis_prime(n):outputn
This pseudocode is written for clarity; although some redundant computations have been eliminated by controlling the odd/even x/y combinations, it still wastes almost half of its quadratic computations on non-productive loops that do not pass the modulo tests, so it will not be faster than an equivalent wheel-factorized (2/3/5) sieve of Eratosthenes. To improve its efficiency, a method must be devised to minimize or eliminate these non-productive computations.
Explanation
The algorithm completely ignores any numbers with remainder modulo 60 that is divisible by 2, 3, or 5, since numbers with a modulo-60 remainder divisible by one of these three primes are themselves divisible by that prime.
All numbers n with modulo-sixty remainder 1, 13, 17, 29, 37, 41, 49, or 53 have a modulo-four remainder of 1. These numbers are prime if and only if the number of solutions to 4x2 + y2 = n is odd and the number is squarefree (theorem 6.1 of [1]).
All numbers n with modulo-sixty remainder 7, 19, 31, or 43 have a modulo-six remainder of 1. These numbers are prime if and only if the number of solutions to 3x2 + y2 = n is odd and the number is squarefree (theorem 6.2 of [1]).
All numbers n with modulo-sixty remainder 11, 23, 47, or 59 have a modulo-twelve remainder of 11. These numbers are prime if and only if the number of solutions to 3x2 − y2 = n is odd and the number is squarefree (theorem 6.3 of [1]).
None of the potential primes are divisible by 2, 3, or 5, so they cannot be divisible by their squares. This is why squarefree checks do not include 22, 32, and 52.
Computational complexity
It can be computed[1] that the above series of three quadratic equation operations each has a number of operations that is a constant ratio of the range as the range goes to infinity; additionally, the prime-square-free culling operations can be described by the prime zeta function at 2 with constant offsets and factors, so it is also a constant factor of the range as the range goes to infinity. Therefore, the algorithm described above can compute primes up to N using O(N) operations with only O(N) bits of memory.
The page-segmented version implemented by the authors has the same O(N) operations but reduces the memory requirement to just that required by the base primes below the square root of the range of O(N1/2/log(N)) bits of memory plus a minimal page buffer. This is slightly better performance with the same memory requirement as the page-segmented sieve of Eratosthenes, which uses O(N log(log(N))) operations and the same O(N1/2/log(N)) bits of memory[2] plus a minimal page buffer. However, such a sieve does not outperform a sieve of Eratosthenes with maximum practical wheel factorization (a combination of a 2/3/5/7 sieving wheel and pre-culling composites in the segment page buffers using a 2/3/5/7/11/13/17/19 pattern), which, despite having slightly more operations than the Sieve of Atkin for very large but practical ranges, has a constant factor of less complexity per operation by about three times in comparing the per-operation time between the algorithms implemented by Bernstein in CPU clock cycles per operation. The main problem with the page-segmented sieve of Atkin is the difficulty in implementing the prime-square-free culling sequences due to the span between culls rapidly growing far beyond the page buffer span; the time expended for this operation in Bernstein's implementation rapidly grows to many times the time expended in the actual quadratic equation calculations, meaning that the linear complexity of the part that would otherwise be quite negligible becomes a major consumer of execution time. Thus, even though an optimized implementation may again settle to a O(N) time complexity, this constant factor of increased time per operations means that the sieve of Atkin is slower.
A special modified "enumerating lattice points" variation which is not the above version of the sieve of Atkin can theoretically compute primes up to N using O(N / log(log(N))) operations with N1/2+o(1) bits of memory,[1] but this variation is rarely implemented. That is a little better in performance at a very high cost in memory as compared to both the ordinary page-segmented version and to an equivalent but rarely-implemented version of the sieve of Eratosthenes which uses O(N) operations and O(N1/2 log(log(N)) / log(N)) bits of memory.[3][4][5]
Pritchard observed that for the wheel sieves, one can reduce memory consumption while preserving Big-O time complexity, but this generally comes at a cost in an increased constant factor for time per operation due to the extra complexity. Therefore, this special version is likely more of value as an intellectual exercise than a practical prime sieve with reduced real time expended for a given large practical sieving range.