This feels overengineered and it kinda is, but other efforts to
inroduce sequential naming/UUID naming haven't been that fruitful
either.
10 character random "hash" i now changed to.
1. first character - last character in UUID4 ID of request/job
2. three characters - derived from current timestamp.
4. 6 characters - random data.
This satisfies all three requirements:
1. Readers - temporal locality should result in spatial locality on disk. (fewer pages accessed)
2. Single writer - temporal locality should result in spatial locality. (fewer dirty pages)
3. Multiple writers - temporal locality should NOT result in spatial locality. (less lock contention)
Mostly concludes https://github.com/frappe/frappe/pull/25309 and https://github.com/frappe/frappe/pull/28349
Rough probabiliy numbers
Assumptions:
- Unique per worker prefix - 16 (uuid's base16 version)
- Rough time spent generating names - 10% of request (very very conservative estimate)
Probability(collision) = P(at least one prefix collision) * P(time collision)
Probability(collision) = (1 - p(all different)) * 10%
Probability(collision) = (1 - (16! / 16-N! )/ 16^N ) * 10%
| N (concurrency) | Probability(collision) |
| 1 | 0.0% |
| 2 | 0.6% |
| 3 | 1.8% |
| 4 | 3.3% |
| 5 | 5.0% |
| 6 | 6.6% |
| 7 | 7.9% |
| 8 | 8.8% |