My last idea was somewhat guesswork; the basic theory was that if you start with a mersenne twister-like concept, and shuffle things with operators that don’t lose information entropy, it should retain the entropy from the original table and “mix well” with input seeds.
This new idea has some qualities undesirable of a lot of PRNGs, but has some very interesting side-effects I’d be excited to show.
- It may be slow. I haven’t considered this a lot, but it’s probably not as fast as most bit-twiddling style hashes.
- Depending on the implementation, it’s entirely reversable By design, it’s a reversible mapping. That is to say, if you have a given output, you can very quickly and very easily find every other possible output. You cannot figure out what the seed specifically was or which of those possibilities were actually outputted, although you could get the set of candidates.
Here’s the idea: Collatz
See, the collatz conjecture is the idea that there’s a mathematical function, which when used on a number (let’s say ‘n’) repeatedly, always goes to 1. The function is defined as:
If n is even | if n is odd |
---|---|
n/2 | n*3+1 |
So, take 5 for instance. 5 is odd, so the next number is 5*3+1 = 16
, and 16 is even, so it’s divided by 2. 16/2 = 8
. This keeps on happening 8/2 = 4
, 4/2 = 2
, 2/2 = 1
, and since it got to 1, 5 does not violate the collatz conjecture.
graph TB A["5"] -- odd --> B["16"] -- even --> C["8"] -- "even, etc..." --> D["1"]
Now, I did some thinking about the collatz conjecture. Specifically, whether it can be reversed. Take a number, starting at 1. You can always multiply it by 2, resulting in an even number. If the number minus one, divided by 3, is odd, then you can do that too. (An example of when you can’t do it is 25; 25 - 1 is divisible by 3, but results in 8. If we got 8 in collatz, we would be dividing it by 2, not multiplying it by 3 and adding one, so the reversal would be invalid).
Always | If (n - 1)/3 is an odd whole number |
---|---|
you may set new n to n*2 | You may set new n to (n-1)/3 |
Here’s a little mapping of the start of this function
graph TB A["1"] --> B["2"] --> C["4"] C -- "*2" --> D["8"] C -- "-1 then /3" --> A D -- "*2" --> E["16"] E -- "*2" --> F["32"] E -- "-1 then /3" --> G["5"]
Notice that, while there’s a couple places where a node points to two different things (In this graph, 4 and 16. 4 points to 1 and 8, whereas 16 points to 32 and 5), a node is always pointed at by a single arrow. This makes sense; the collatz function is determined by whether a number is even or odd, so in order for the same number to be arrived at twice in the reverse collatz function, the number would have to be both even and odd.
This does, along with a couple other features, guarantee an interesting ability. If we encode which “path” is taken as a binary string (Say, a left *2
operation being “0” and -1 and then /3 being 1
), you may start to see my vision. There’s a couple interesting quirks that have to be dealt with, first, though.
First of all, this will only work for some binary strings. I can’t, for instance, start with 01
, since that would result in 1/3
, which isn’t a whole number.
If I want to set it to any given arbitrary binary string. I could just ‘skip over’ those times it’s impossible, whenever (n-1)/3 is not whole and odd (If I choose an impossible option in the binary, the algorithm presumes I’m choosing the other option repeatedly until that one becomes possible), that almost solves it, but one more edge case must be resolved.
Numbers divisible by are a problem. There’s a fairly easy way to prove that, from there, we will only be able to multiply by 2, making binary strings that get to a multiple of 3 and then have 1
s impossible.
The proof is that a given multiple of 3, minus 1, is not going to be divisible by 3, because every multiple of 3 is exactly 3 away from the next and last; never 1. Multiplying by 2 leaves this fact unaffected, since multiplying any number by 2 does not change it’s other divisors.
SO, we can either exclude any option that leads to a number divisible by 3 and treat them like the other impossible cases, or we can only allow it in cases where there are no 1s afterwards. Either way, I believe this is the last internal constraint one must put before they have a pretty interesting construction: Given any binary number, you may apply the reverse collatz function to get to another number. This new number, if you apply the collatz function to it, will take precisely the steps that the binary number describes, excepting any that were “forced” as described earlier(Which can be determined easily).
This creates a 1:1 mapping between numbers assuming the collatz conjecture holds true, since the reverse function can’t get to the same number in two ways (It would imply it’s both even and odd), and the reverse function can’t have numbers impossible for it to arrive to, or else that number put through the normal collatz function would never get to 0, disproving the conjecture.
Reverse collatz from 1 to encode a number, collatz on the result to decode. The inspiration to make this was the fact that collatz is a well-known example of the butterfly effect in chaos theory; small alterations to the beginning will lead to largely different paths. Similarly, small alterations to the path, will lead to very different numbers.
There are a couple other really nice properties of collatz which make for an interesting function. No two binary inputs of the reverse collatz function will lead to the same result, since the collatz function never has 2 different paths for the same input number. And, since it’s unproven whether there is any number which doesn’t lead to 1, it’s a safe bet that any number you can think of can be represented.
My lazy way to turn this into a prng is just to seed it starting at 1, and then do some things to the number (Modulo, some predictable shuffle operations like binary rotation or xor, etc) and do reverse collatz starting at the resultant number(Changing it if it’s divisible by 3, first). This doesn’t intuitively seem reversible to me, but I mentioned that it may be reversible earlier because the modulo and shuffle operations weren’t yet considered, and I’m thinking they’re not required as long as you don’t mind the numbers getting /very/ big. I’m not sure if there’s a clever way to repeatedly reverse it without those, since it’s unclear how you’d gather the starting point after the first one, without knowing what the first one is and going from there.