Update README.md

Signed-off-by: David Rotermund <54365609+davrot@users.noreply.github.com>
2025-07-09 19:00:03 +02:00 · 2023-12-01 15:04:45 +01:00 · 2023-12-01 15:04:45 +01:00 · 05373d2adb
commit 05373d2adb
parent 3cd45a767d
1 changed files with 71 additions and 0 deletions
--- a/matlab/2/README.md
+++ b/matlab/2/README.md
@@ -45,3 +45,74 @@ Two further examples for 8-bit numbers:
 10101010 $=0 \star 2^0+1 \star 2^1+0 \star 2^2+1 \star 2^3 +0 \star 2^4+1 \star 2^5+0 \star 2^6+1 \star 2^7 = 170$

 To represent negative numbers, a small trick is necessary: one specific bit codes for the sign of the number. In an $n$-bit number, one would for example reserve the $n$-th bit for the sign, while the remaining $n-1$ bits code, as usual, a binary number $z$. If the $n$-th bit is not set, the result is $+z$. If the $n$-th bit is set, the result is $-(2^{n-1})+z$. For illustration, again a small table:
+
+|dual system|	decimal system|
+| ------------- |:-------------:|
+|01111111|	+127|
+|01111110|	+126|
+|$\vdots$|	$\vdots$|
+|00000010|	+2|
+|00000001|	+1|
+|00000000|	+0|
+|11111111|	-1|
+|11111110|	-2|
+|$\vdots$|	$\vdots$|
+|10000010|	-126|
+|10000001|	-127|
+|10000000|	-128|
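+
+The sign rule can be checked directly in Matlab. The following is a minimal sketch; the variable names (`bits`, `z`, `sign_bit`, `value`, `check`) are chosen here for illustration only, and the bit pattern is one of the rows of the table.
+
+```matlab
+bits = '10000001';                 % 8-bit pattern taken from the table
+n = length(bits);
+
+z = bin2dec(bits(2:end));          % remaining n-1 bits as an ordinary binary number
+sign_bit = (bits(1) == '1');       % true if the n-th (leftmost) bit is set
+
+value = -sign_bit * 2^(n-1) + z;   % -(2^(n-1)) + z if the sign bit is set, else +z
+fprintf('%s -> %d\n', bits, value);
+
+% Cross-check: reinterpret the same bit pattern as a signed 8-bit integer
+check = typecast(uint8(bin2dec(bits)), 'int8');
+```
+
+Both `value` and `check` should agree with the table entry for this bit pattern.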
+
+Certain bit lengths have names:
+
+|name|	size|
+| ------------- |:-------------:|
+|1 Byte|	8 Bit|
+|1 Word|	16 Bit|
+|1 Kilobyte|	1024 Byte|
+|1 Megabyte|	1024 Kilobyte|
+|1 Gigabyte|	1024 Megabyte|
+|1 Terabyte|	1024 Gigabyte|
+
+## Representation of Real Numbers and Numerical Errors
+Real numbers are, by their nature, analogue quantities. Hence we would expect the handling of these numbers on digital computers not to be completely problem-free. Present digital computers usually represent real numbers as floating-point numbers.
+
+$\mbox{floating-point number} = \mbox{mantissa} \cdot \mbox{base}^{\mbox{exponent}}$
+
+The precision with which a real number can be represented is determined by the number of available bits. "Single precision" requires 4 bytes, "double precision" 8 bytes. The latter is the default in Matlab. The IEEE double-precision format uses 53 bits for the mantissa and 11 bits for the exponent; the base is 2. One of the mantissa bits stores the sign of the number, which leaves 52 bits for its magnitude. The exponent is stored with a fixed offset (bias) of 1023 instead of a sign bit, and for normalized numbers it ranges from $-1022$ to $+1023$. In the IEEE notation, the mantissa always represents a value in the interval $[1, 2)$: the $52$ bits are used to add up negative powers of 2, so that mantissa $=1+\sum_{i=1}^{52} b_i 2^{-i}$, where $b_i=1$ if the $i$-th bit of the mantissa is set and $b_i=0$ otherwise.
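+
+The decomposition into mantissa and exponent can be inspected in Matlab. The sketch below uses the two-output form of the built-in `log2`, which returns a mantissa in $[0.5, 1)$; rescaling it by a factor of 2 to match the IEEE convention $[1, 2)$ is a choice made here, and the example value `x = 10.5` is arbitrary.
+
+```matlab
+x = 10.5;                  % arbitrary example value
+
+[f, e] = log2(x);          % x = f * 2^e with 0.5 <= abs(f) < 1
+mantissa = 2 * f;          % rescaled into the interval [1, 2)
+exponent = e - 1;          % compensates for the rescaling
+fprintf('%g = %g * 2^%d\n', x, mantissa, exponent);
+
+% Limits of double precision as reported by Matlab
+fprintf('eps     = %g (equals 2^-52)\n', eps);
+fprintf('realmax = %g\n', realmax);
+fprintf('realmin = %g (smallest normalized number)\n', realmin);
+```
+
+For this value the mantissa is $1.3125 = 1 + 2^{-2} + 2^{-4}$ and the exponent is $3$.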
+
+## Range Error
+The maximal range of floating-point numbers is determined by the number of bits used to code for the exponent. The typical range for single precision is
+
+$2^{\pm 127} \approx 10^{\pm 38}$
+
+and for double precision
+
+$2^{\pm 1023} \approx 10^{\pm 308}$
+
+Arithmetic operations on these numbers can produce results outside this range. The resulting error is called a range error. As an example, we consider the Bohr radius in SI units
+
+$a_0 = \frac{4\pi\varepsilon_0\hbar^2}{m_e e^2}\approx 5.3\times 10^{-11}\,\mbox{m}$
+
+The quantity $\hbar$ is Planck's constant divided by $2\pi$. The Bohr radius itself lies within the range of single precision floating-point numbers. However, the same does not hold for the numerator $4\pi\varepsilon_0\hbar^2 \approx 1.24\times 10^{-78}\,\mbox{kg}\,\mbox{C}^2\,\mbox{m}$ and the denominator $m_e e^2 \approx 2.34\times 10^{-68}\,\mbox{kg}\,\mbox{C}^2$: both are smaller than the smallest positive single precision number and are rounded to zero. Hence, evaluating the formula as written fails in single precision. A simple solution is to use natural units, for example measuring distances in units of the Bohr radius.
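+
+The problem can be reproduced in Matlab by forcing single precision with `single()`. This is only a sketch: the constants are rounded values typed in by hand, the variable names are illustrative, and the regrouped expression at the end is one possible workaround that is assumed here, not the natural-units approach.
+
+```matlab
+% Physical constants in SI units (rounded values, entered by hand)
+eps0 = single(8.8541878e-12);   % vacuum permittivity
+hbar = single(1.0545718e-34);   % Planck's constant divided by 2*pi
+m_e  = single(9.1093837e-31);   % electron mass
+q_e  = single(1.6021766e-19);   % elementary charge
+
+numerator   = 4 * pi * eps0 * hbar^2;   % hbar^2 underflows to 0 in single precision
+denominator = m_e * q_e^2;              % the product underflows to 0 as well
+a0_naive    = numerator / denominator;  % 0/0 evaluates to NaN
+
+% Regrouping keeps every intermediate result well inside the single range
+a0_regrouped = 4 * pi * eps0 * ((hbar / q_e)^2 / m_e);
+
+fprintf('naive formula : %g\n', a0_naive);
+fprintf('regrouped     : %g m\n', a0_regrouped);
+```
+
+The naive evaluation yields `NaN`, while the regrouped one gives a value close to $5.3\times 10^{-11}$ m.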
+
+An even bigger problem can be illustrated by the calculation of the factorial. The factorial is defined as
+
+$n! = n\cdot(n-1)\cdot(n-2)\cdot\ldots\cdot 3\cdot 2\cdot 1$
+
+In Matlab, it is easy to verify with the function `factorial(n)` that the factorial for $n>170$ cannot be represented, even with double precision numbers. A way out is provided by logarithms, since the logarithm of a large number is still moderately small, e.g. $\log_{10}(10^{100}) = 100$. It follows that
+
+$\ln(n!) = \ln(n) + \ln(n-1) + \ldots + \ln(3) + \ln(2) + \ln(1)$
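+
+Both statements are easy to try out. In the sketch below, the comparison with `gammaln` (Matlab's logarithm of the gamma function, with $\Gamma(n+1) = n!$) is an addition made here for cross-checking; the variable names are illustrative.
+
+```matlab
+f170 = factorial(170);          % still finite in double precision
+f171 = factorial(171);          % exceeds realmax and is returned as Inf
+
+n = 171;
+ln_fact = sum(log(1:n));        % ln(n!) = ln(1) + ln(2) + ... + ln(n)
+ln_ref  = gammaln(n + 1);       % reference value, since gamma(n+1) = n!
+
+fprintf('factorial(170) = %g, factorial(171) = %g\n', f170, f171);
+fprintf('ln(%d!) = %.6f (sum of logs), %.6f (gammaln)\n', n, ln_fact, ln_ref);
+```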
+
+For larger $n$, however, the evaluation of this expression is too laborious. If $n>30$, one is advised to use Stirling's formula
+
+$\ln(n!) = n\ln(n)-n+\frac{1}{2}\ln(2\pi n)+\ln\left(1+\frac{1}{12n}+\frac{1}{288n^2}+\ldots\right)$
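+
+To get a feeling for the accuracy of Stirling's formula, it can be compared with the direct sum of logarithms. The sketch below truncates the series after the $1/(288n^2)$ term; the value `n = 50` and the variable names are arbitrary choices for illustration.
+
+```matlab
+n = 50;                                  % example value with n > 30
+
+ln_exact = sum(log(1:n));                % direct sum of logarithms
+
+% Stirling's formula, truncated after the 1/(288 n^2) term
+correction  = 1 + 1/(12*n) + 1/(288*n^2);
+ln_stirling = n*log(n) - n + 0.5*log(2*pi*n) + log(correction);
+
+fprintf('sum of logs: %.10f\n', ln_exact);
+fprintf('Stirling   : %.10f\n', ln_stirling);
+fprintf('difference : %.2e\n', abs(ln_exact - ln_stirling));
+```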
+
+The factorial $n!$ can then be written as
+
+$n! = \mbox{mantissa}\times 10^{\mbox{exponent}}$
+
+To get the mantissa and the exponent, we take the logarithm to base 10 (reminder: $\log_{10}(x) = \ln(x)/\ln(10)$)
+
+$\log_{10}(n!) = \log_{10}(\mbox{mantissa})+{\mbox{exponent}}$
+
+We now identify the integer part of $\log_{10}(n!)$ with the exponent. The fractional part determines the mantissa, i.e. mantissa = $10^a$ with $a = \log_{10}(n!)-\mbox{exponent}$.
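+
+The procedure can be sketched in a few lines of Matlab. The example `n = 200` and the names `log10_fact`, `expo` and `mant` are assumptions made for this illustration; for $n>30$ the sum of logarithms could equally be replaced by Stirling's formula.
+
+```matlab
+n = 200;                                   % 200! is far beyond realmax
+
+log10_fact = sum(log(1:n)) / log(10);      % log10(n!) = ln(n!) / ln(10)
+expo = floor(log10_fact);                  % integer part -> exponent
+mant = 10^(log10_fact - expo);             % fractional part -> mantissa
+
+fprintf('%d! = %.4f x 10^%d\n', n, mant, expo);
+```
+
+Although `factorial(200)` itself overflows, mantissa and exponent are both ordinary double precision numbers.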
+
+From these examples we learn that range errors can usually be circumvented with a little creativity.