Lua uses only a single number type, which can be redefined at compile time. By default this is a double, i.e. a floating-point number with 53 bits of precision. Integer operations in the range of 32 bit numbers (and beyond) are exact: there is no loss of precision, so there is no need to add an extra integer number type. Modern desktop and server CPUs have fast floating-point hardware, so FP arithmetic is nearly the same speed as integer arithmetic. Any differences vanish under the overhead of the Lua interpreter itself.
Even today, many embedded systems lack support for fast FP operations. These systems benefit from compiling Lua with an integer number type (with 32 bits or more).
The different possible number types and the use of FP numbers cause some problems when defining bitwise operations on Lua numbers. The following sections define the operational semantics and try to explain the rationale behind them.
Input and Output Ranges
- Bitwise operations cannot sensibly be applied to FP numbers (or their underlying bit patterns). The numbers must be converted to integers before operating on them, and the result must be converted back to an FP number.
- It's desirable to define semantics that work the same across all platforms. This dictates that all operations are based on the common denominator of 32 bit integers.
- The float type provides only 24 bits of precision. This makes it unsuitable for use in bitwise operations. Lua BitOp refuses to compile against a Lua installation with this number type.
- Bit operations only deal with the underlying bit patterns and generally ignore signedness (except for arithmetic right-shift). They are commonly displayed and treated like unsigned numbers, though.
- But the Lua number type must be signed and may be limited to 32 bits. Defining the result type as an unsigned number would not be cross-platform safe. All bit operations are thus defined to return results in the range of signed 32 bit numbers (converted to the Lua number type).
- Hexadecimal literals are treated as unsigned numbers by the Lua parser before converting them to the Lua number type. This means they can be out of the range of signed 32 bit integers if the Lua number type has a greater range. E.g. 0xffffffff has a value of 4294967295 in the default installation, but may be -1 on embedded systems.
- It's highly desirable that hex literals are treated uniformly across systems when used in bitwise operations. All bit operations accept arguments in the signed or the unsigned 32 bit range (and more, see below). Numbers with the same underlying bit pattern are treated the same by all operations.
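The notion of "the same underlying bit pattern" can be illustrated in plain Lua. The following is a minimal sketch, assuming the default number type of double. The helpers same_bits32 and unsigned32 are hypothetical names made up for this illustration and are not part of Lua BitOp.

```lua
-- Assumption: the Lua number type is a double (the default).
-- Both helpers are hypothetical and only model the rules described above.

-- Two numbers share a 32 bit pattern if they are congruent modulo 2^32.
local function same_bits32(a, b)
  return a % 2^32 == b % 2^32
end

-- View a signed 32 bit result as an unsigned number, e.g. for display.
local function unsigned32(x)
  return x % 2^32
end

assert(same_bits32(0xffffffff, -1))     -- unsigned and signed view of all ones
assert(same_bits32(0x80000000, -2^31))  -- only the top bit set
assert(not same_bits32(1, -1))          -- different bit patterns
assert(unsigned32(-1) == 4294967295)
```

Since all bit operations return results in the signed range, a conversion like unsigned32 is only needed when a result has to be shown or compared as an unsigned number.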
Modular Arithmetic
Arithmetic operations on n-bit integers are usually based on the rules of modular arithmetic modulo 2^n. Numbers wrap around when the mathematical result of operations is outside their defined range. This simplifies hardware implementations, and some algorithms actually require this behavior (like many cryptographic functions).
E.g. for 32 bit integers the following holds: 0xffffffff + 1 = 0
Arithmetic modulo 2^32 is trivially available if the Lua number type is a 32 bit integer. Otherwise normalization steps must be inserted. Modular arithmetic should work the same across all platforms as far as possible:
- For the default number type of double, arguments can be in the range of ±2^51 and still be safely normalized across all platforms by taking their least-significant 32 bits. The limit is derived from the way doubles are converted to integers.
- The function bit.tobit can be used to explicitly normalize numbers to implement modular addition or subtraction. E.g. bit.tobit(0xffffffff + 1) returns 0 on all platforms.
- The limit on the argument range implies that modular multiplication is usually restricted to multiplying already normalized numbers with small constants. FP numbers are limited to 53 bits of precision, anyway. E.g. (2^30+1)^2 does not return an odd number when computed with doubles.
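The points above can be sketched in plain Lua. This is only a model, assuming the default number type of double. The function norm32 is a hypothetical stand-in for the normalization step and is not the BitOp implementation.

```lua
-- Assumption: the Lua number type is a double (the default).
-- norm32 is a hypothetical pure-Lua model of normalizing a number
-- to the signed 32 bit range.
local function norm32(x)
  x = x % 2^32          -- keep the least-significant 32 bits: 0 .. 2^32-1
  if x >= 2^31 then
    x = x - 2^32        -- reinterpret as signed: -2^31 .. 2^31-1
  end
  return x
end

-- Modular addition: add first, normalize afterwards.
assert(norm32(0xffffffff + 1) == 0)
assert(norm32(0x7fffffff + 1) == -2147483648)

-- Modular multiplication breaks down for large operands: the exact square
-- of 2^30+1 is 2^60 + 2^31 + 1, which needs 61 bits. A double drops the
-- low bit before any normalization could take place.
local sq = (2^30 + 1)^2
assert(sq == 2^60 + 2^31)  -- the trailing +1 has been rounded away
assert(sq % 2 == 0)        -- even, although the exact result is odd
```

The first assertion mirrors the bit.tobit(0xffffffff + 1) example from the list. The last two show why multiplication should be limited to normalized numbers and small constants.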
BTW: The tr_i function shown here is one of the non-linear functions of the (flawed) MD5 cryptographic hash and relies on modular arithmetic for correct operation. The result is fed back to other bitwise operations (not shown) and does not need to be normalized until the last step.
Restricted and Undefined Behavior
The following rules are intended to give a precise and useful definition (for the programmer), yet give the implementation (interpreter and compiler) the maximum flexibility and the freedom to apply advanced optimizations. It's strongly advised not to rely on undefined or implementation-defined behavior.
- All kinds of floating-point numbers are acceptable to the bitwise operations. None of them cause an error, but some may invoke undefined behavior:
  - -0 is treated the same as +0 on input and is never returned as a result.
  - Passing ±Inf, NaN or numbers outside the range of ±2^51 as input yields an undefined result.
  - Non-integral numbers may be rounded or truncated in an implementation-defined way. This means the result could differ between different BitOp versions, different Lua VMs, on different platforms or even between interpreted vs. compiled code (as in LuaJIT). Avoid passing fractional numbers to bitwise functions. Use math.floor() or math.ceil() to get defined behavior.
- Lua provides auto-coercion of string arguments to numbers by default. This behavior is deprecated for bitwise operations.
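One way to stay within defined behavior is to check and convert values before handing them to a bitwise function. The following is a minimal sketch. The helper checked_int is hypothetical and not part of Lua BitOp, and treating the bounds ±2^51 themselves as acceptable is an assumption of this sketch.

```lua
-- Hypothetical helper (not part of Lua BitOp): validate a value and make
-- it integral before it is passed to a bitwise function.
local LIMIT = 2^51

local function checked_int(v)
  if type(v) == "string" then
    v = tonumber(v)       -- explicit conversion instead of auto-coercion
  end
  if type(v) ~= "number" then
    error("number expected", 2)
  end
  if v ~= v then          -- NaN is the only value not equal to itself
    error("NaN is not allowed", 2)
  end
  if v < -LIMIT or v > LIMIT then   -- also rejects +Inf and -Inf
    error("number out of range", 2)
  end
  return math.floor(v)    -- explicit, well-defined rounding
end

assert(checked_int(5.7) == 5)
assert(checked_int(-2.5) == -3)   -- math.floor rounds toward minus infinity
assert(checked_int("12") == 12)
assert(not pcall(checked_int, 0/0))
assert(not pcall(checked_int, math.huge))
assert(not pcall(checked_int, 2^52))
```

Whether math.floor() or math.ceil() is the right choice depends on the application. The point is that the rounding is chosen explicitly and not left to the implementation.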