Algorithms that compute do not run forever. They are nevertheless capable of computing up to any desired accuracy .
Choose any and compute .
For example (setting ):
Compare the results with
It appears that the multiplication of the input by gives an accuracy of k decimal digits.[note 2]
To compute the (entire) decimal representation of , one can execute an infinite number of times, increasing by a factor at each pass.
Assume that in the next program () the procedure is already defined and — for the sake of the argument — that all variables can hold integers of unlimited magnitude.
Then will print the entire decimal representation of .[note 3]
// Print sqrt(y), without haltingvoidsqrtForever(unsignedinty){unsignedintresult=isqrt(y);printf("%d.",result);// print result, followed by a decimal pointwhile(true)// repeat forever ...{y=y*100;// theoretical example: overflow is ignoredresult=isqrt(y);printf("%d",result%10);// print last digit of result}}
For example, if one computes using binary search, one obtains the sequence
This computation takes 21 iteration steps, whereas linear search (ascending, starting from ) needs 1414 steps.
Algorithm using Newton's method
One way of calculating and is to use Heron's method, which is a special case of Newton's method, to find a solution for the equation , giving the iterative formula
One can prove[citation needed] that is the largest possible number for which the stopping criterion
ensures in the algorithm above.
In implementations which use number formats that cannot represent all rational numbers exactly (for example, floating point), a stopping constant less than 1 should be used to protect against round-off errors.
Domain of computation
Although is irrational for many , the sequence contains only rational terms when is rational. Thus, with this method it is unnecessary to exit the field of rational numbers in order to calculate , a fact which has some theoretical advantages.
Using only integer division
For computing for very large integers n, one can use the quotient of Euclidean division for both of the division operations. This has the advantage of only using integers for each intermediate value, thus making the use of floating point representations of large numbers unnecessary. It is equivalent to using the iterative formula
By using the fact that
one can show that this will reach within a finite number of iterations.
In the original version, one has for , and for . So in the integer version, one has and until the final solution is reached. For the final solution , one has and , so the stopping criterion is .
However, is not necessarily a fixed point of the above iterative formula. Indeed, it can be shown that is a fixed point if and only if is not a perfect square. If is a perfect square, the sequence ends up in a period-two cycle between and instead of converging.
Example implementation in C
// Square root of integerunsignedintint_sqrt(unsignedints){// Zero yields zero// One yields oneif(s<=1)returns;// Initial estimate (must be too high)unsignedintx0=s/2;// Updateunsignedintx1=(x0+s/x0)/2;while(x1<x0)// Bound check{x0=x1;x1=(x0+s/x0)/2;}returnx0;}
Numerical example
For example, if one computes the integer square root of 2000000 using the algorithm above, one obtains the sequence
In total 13 iteration steps are needed. Although Heron's method converges quadratically close to the solution, less than one bit precision per iteration is gained at the beginning. This means that the choice of the initial estimate is critical for the performance of the algorithm.
When a fast computation for the integer part of the binary logarithm or for the bit-length is available (like e.g. std::bit_width in C++20), one should better start at
which is the least power of two bigger than . In the example of the integer square root of 2000000, , , and the resulting sequence is
In this case only four iteration steps are needed.
Digit-by-digit algorithm
The traditional pen-and-paper algorithm for computing the square root is based on working from higher digit places to lower, and as each new digit pick the largest that will still yield a square . If stopping after the one's place, the result computed will be the integer square root.
Using bitwise operations
If working in base 2, the choice of digit is simplified to that between 0 (the "small candidate") and 1 (the "large candidate"), and digit manipulations can be expressed in terms of binary shift operations. With * being multiplication, << being left shift, and >> being logical right shift, a recursive algorithm to find the integer square root of any natural number is:
definteger_sqrt(n:int)->int:assertn>=0,"sqrt works for only non-negative inputs"ifn<2:returnn# Recursive call:small_cand=integer_sqrt(n>>2)<<1large_cand=small_cand+1iflarge_cand*large_cand>n:returnsmall_candelse:returnlarge_cand# equivalently:definteger_sqrt_iter(n:int)->int:assertn>=0,"sqrt works for only non-negative inputs"ifn<2:returnn# Find the shift amount. See also [[find first set]],# shift = ceil(log2(n) * 0.5) * 2 = ceil(ffs(n) * 0.5) * 2shift=2while(n>>shift)!=0:shift+=2# Unroll the bit-setting loop.result=0whileshift>=0:result=result<<1large_cand=(result+1)# Same as result ^ 1 (xor), because the last bit is always 0.iflarge_cand*large_cand<=n>>shift:result=large_candshift-=2returnresult
Traditional pen-and-paper presentations of the digit-by-digit algorithm include various optimizations not present in the code above, in particular the trick of pre-subtracting the square of the previous digits which makes a general multiplication step unnecessary. See Methods of computing square roots § Binary numeral system (base 2) for an example.[1]
An example algorithm for 64-bit unsigned integers is below. The algorithm:
Normalizes the input inside u64_isqrt.
Calls u64_normalized_isqrt_rem, which requires a normalized input.
Calls u32_normalized_isqrt_rem with the most-significant half of the normalized input's bits, which will already be normalized as the most-significant bits remain the same.
Continues on recursively until there's an algorithm that's faster when the number of bits is small enough.
u64_normalized_isqrt_rem then takes the returned integer square root and remainder to produce the correct results for the given normalized u64.
u64_isqrt then denormalizes the result.
/// Performs a Karatsuba square root on a `u64`.pubfnu64_isqrt(mutn: u64)-> u64{ifn<=u32::MAXasu64{// If `n` fits in a `u32`, let the `u32` function handle it.returnu32_isqrt(nasu32)asu64;}else{// The normalization shift satisfies the Karatsuba square root// algorithm precondition "a₃ ≥ b/4" where a₃ is the most// significant quarter of `n`'s bits and b is the number of// values that can be represented by that quarter of the bits.//// b/4 would then be all 0s except the second most significant// bit (010...0) in binary. Since a₃ must be at least b/4, a₃'s// most significant bit or its neighbor must be a 1. Since a₃'s// most significant bits are `n`'s most significant bits, the// same applies to `n`.//// The reason to shift by an even number of bits is because an// even number of bits produces the square root shifted to the// left by half of the normalization shift://// sqrt(n << (2 * p))// sqrt(2.pow(2 * p) * n)// sqrt(2.pow(2 * p)) * sqrt(n)// 2.pow(p) * sqrt(n)// sqrt(n) << p//// Shifting by an odd number of bits leaves an ugly sqrt(2)// multiplied in.constEVEN_MAKING_BITMASK: u32=!1;letnormalization_shift=n.leading_zeros()&EVEN_MAKING_BITMASK;n<<=normalization_shift;let(s,_)=u64_normalized_isqrt_rem(n);letdenormalization_shift=normalization_shift/2;returns>>denormalization_shift;}}/// Performs a Karatsuba square root on a normalized `u64`, returning the square/// root and remainder.fnu64_normalized_isqrt_rem(n: u64)-> (u64,u64){constHALF_BITS: u32=u64::BITS>>1;constQUARTER_BITS: u32=u64::BITS>>2;constLOWER_HALF_1_BITS: u64=(1<<HALF_BITS)-1;debug_assert!(n.leading_zeros()<=1,"Input is not normalized: {n} has {} leading zero bits, instead of 0 or 1.",n.leading_zeros());lethi=(n>>HALF_BITS)asu32;letlo=n&LOWER_HALF_1_BITS;let(s_prime,r_prime)=u32_normalized_isqrt_rem(hi);letnumerator=((r_primeasu64)<<QUARTER_BITS)|(lo>>QUARTER_BITS);letdenominator=(s_primeasu64)<<1;letq=numerator/denominator;letu=numerator%denominator;letmuts=(s_prime<<QUARTER_BITS)asu64+q;letmutr=(u<<QUARTER_BITS)|(lo&((1<<QUARTER_BITS)-1));letq_squared=q*q;ifr<q_squared{r+=2*s-1;s-=1;}r-=q_squared;return(s,r);}
In programming languages
Some programming languages dedicate an explicit operation to the integer square root calculation in addition to the general case or can be extended by libraries to this end.