Update

2024-09-08 02:25:27 +02:00 · 5134e55998
commit 5134e55998
parent b8828f4d2f
5 changed files with 1828 additions and 1827 deletions
--- a/sqrt.md
+++ b/sqrt.md
@@ -30,37 +30,41 @@ TODO

 ## Programming

-TODO
+Square root is a very common operation that often has to be VERY fast: it's used a lot for example in [computer graphics](graphics.md), where it may need to be computed several million times per second. For this reason there often exist special instructions and other hardware accelerated options, and you will very likely have such a function available on most computers. In the [C](c.md) standard math library there's the `sqrt` [floating point](float.md) function, which will probably be very fast. But let's now consider that you want to program your own square root, which may happen for example when dealing with [embedded](embedded.md) computers.

-If we need extreme speed, we may use a [look up table](lut.md) with precomputed values.
+If we really need extreme speed, we may always use a [look up table](lut.md) with precomputed values. Of course a table with ALL the values would be very big, but remember we can make a much smaller table (where each item spans a bigger range) that provides just a quick [estimate](approximation.md), from which we then need only a few extra iterations to reach the correct answer.
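
The small-table idea can be sketched roughly as follows. This is only an illustration, not the article's code: the names `sqrtEst` and `sqrtTable` are made up here, the table is indexed by the position of the highest set bit (so each item covers a whole power-of-two range), and the refinement uses averaging steps of the form `(g + x / g) / 2`, which is one possible choice of iteration. A 32 bit unsigned input is assumed.

```c
#include <stdint.h>

/* sqrtEst[k] is an upper estimate of sqrt(x) for every x whose highest
   set bit is bit k, i.e. sqrtEst[k] = ceil(sqrt(2^(k + 1))). */
static const uint32_t sqrtEst[32] =
{
  2, 2, 3, 4, 6, 8, 12, 16, 23, 32, 46, 64, 91, 128, 182, 256,
  363, 512, 725, 1024, 1449, 2048, 2897, 4096, 5793, 8192, 11586,
  16384, 23171, 32768, 46341, 65536
};

uint32_t sqrtTable(uint32_t x)
{
  if (x == 0)
    return 0;

  int k = 0; // index of the highest set bit of x

  for (uint32_t t = x; t > 1; t >>= 1)
    k++;

  uint32_t g = sqrtEst[k]; // quick estimate, never below the true root

  while (1) // refine: the guess only decreases until it hits the answer
  {
    uint32_t n = (g + x / g) / 2;

    if (n >= g)
      break;

    g = n;
  }

  return g; // floor of the square root
}
```

Because the estimate is at most about 1.41 times too big, the loop typically finishes after a handful of steps, e.g. `sqrtTable(1000000)` starts at 1024 and lands on 1000 after one improving step.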

 Within a desired precision, square root can be computed relatively quickly by iterative [binary search](binary_search.md). Here is a simple [C](c.md) function computing integer square root this way:

+{ I checked that the code works for all values of a 32 bit integer. Remember though that the C specification allows integers to be as small as 16 bit; if you suspect you might hit this, adjust the constants, otherwise overflows may happen. On my computer I measured this to be about 2 (with compiler optimization) to 4 (without) times slower than the hardware accelerated floating point stdlib function. ~drummyfish }
+
 ```
-unsigned int sqrt(unsigned int x)
+unsigned int sqrtInt(unsigned int x)
 {
-  unsigned int l = 0, r = x / 2, m;
+  unsigned int m, l = 0,
+    r = x < 8300000 ? x / 128 + 40 : 65535; // upper bound est. for 32 bit int
+    //r = x < 15000 ? x / 64 + 20 : 255;    // <-- for 16 bit int use this

  while (1)
  {
-    if (r - l <= 1)
+    if (l > r)
      break;

    m = (l + r) / 2;

    if (m * m > x)
-      r = m;
+      r = m - 1;
    else
-      l = m;
+      l = m + 1;
  }

-  return (r * r <= x ? r : l) + (x == 1);
+  return r;
 }
 ```

 TODO: Heron's method
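
Heron's method (also known as the Babylonian method) repeatedly replaces a guess `g` with the average of `g` and `x / g`. Below is a minimal integer sketch of it, assuming a 32 bit unsigned input. The name `sqrtHeron` and the initial guess `x / 2` are illustrative choices, not something taken from the article.

```c
#include <stdint.h>

uint32_t sqrtHeron(uint32_t x)
{
  if (x < 2)
    return x; // 0 and 1 are their own square roots

  uint32_t g = x / 2; // initial guess, not smaller than the true root

  while (1)
  {
    uint32_t n = (g + x / g) / 2; // average of the guess and x / guess

    if (n >= g) // no more improvement: g is the floor of the root
      break;

    g = n;
  }

  return g;
}
```

For example `sqrtHeron(1000000)` gives 1000. With such a crude initial guess, large inputs need on the order of 20 iterations, since the first steps only roughly halve the guess; a better starting estimate cuts this down considerably.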

-The following is a **non-iterative [approximation](approximation.md)** of integer square root in [C](c.md) that has acceptable accuracy to about 1 million (maximum error from 1000 to 1000000 is about 7%): { Painstakingly made by me. ~drummyfish }
+The following is a **non-iterative [approximation](approximation.md)** of integer square root in [C](c.md) that has acceptable accuracy to about 1 million (maximum error from 1000 to 1000000 is about 7%): { Painstakingly made by me. This one was even faster than the stdlib function! ~drummyfish }

 ```
 int32_t sqrtApprox(int32_t x)