Generating Unique Ids in PHP – A scheme for UUIDs generation in a distributed application
I was recently involved with a project which had 4 nodes running PHP code and a sharded MySQL database in the backend. The PHP nodes create data objects in a distributed manner and then persist them to the database shards.
Now, each object needs a unique identifier (key). MySQL auto-increment Ids could not be used here since two database shards could pick the same Ids resulting in a collision. We needed an efficient scheme for simultaneously generating unique identifiers on multiple hosts. A few additional constraints apply:
Low Collision Probability:
- The UUIDs should offer strong guarantees of uniqueness. Even if two nodes create UUIDs at the same instant of time, there would be a very, very low probability of a collision.
Scalable:
- There would be no central node / authority to generate them. A central node would be a single point of failure. It would also be a bottleneck as our cluster grows.
- The UUIDs should be generated in a distributed manner. Any web-layer node can generate the UUID locally and this should be efficient to generate.
Length of the UUID:
- We were not trying to achieve a theoretical uniqueness here. Yet we were looking at 20-50 million unique objects in our database.
- An 18-20 digit UUID seemed reasonable.
Not Cryptographically Secure:
- We were not particularly looking for something which is hard to predict or something cryptographically secure. So if the Ids were monotonically increasing or somewhat easier to predict, it was acceptable.
Non-strict Timestamp Synchronization:
- Nodes which generate UUIDs might not be strictly synchronized in time – however they would be within 5-10 seconds of each other.
The PHP Solution
PHP supports the uniqid() method (since PHP 4), which generates a unique id based on the current microsecond timestamp of the local system.
1 2 3 4 | for($i=0; $i<10; $i++){ echo uniqid(); echo "<br/>"; } |
It returns a 13 character hex string – monotonically increasing timestamp values:
4aa5c7f2c2d33
4aa5c7f2c2d4a
4aa5c7f2c2d4e
4aa5c7f2c2d51…
In a scenario where multiple nodes simultaneously generate the unique Id, there is a small probability of a collision. We add a 4 digit random prefix, to reduce the collision probability further.
1 2 3 4 5 6 7 | // Generate a random prefix. $rand = md5(rand()); $randomPrefix = substr($rand, 0, 4); // Generate a UniqId having the random prefix. // (4 digit prefix + 13 digit Uniq) is good enough for us. $uniq = uniqid($randomPrefix); |
Modulo and Shard Numbers
In our scheme, the shard number of an object was determined based on the modulus of the unique Id. Thus:
Shard Number = UniqId % TOTAL_SHARDS
For very large numbers the PHP mod (%) operator results in an overflow. So we consider only the 5 LSB digits for computing the modulo value.
1 2 3 4 5 6 7 8 9 10 11 12 | function getModulo($uniqId) { // Take 5 LSB digits only. $uuidLsb = substr($uniqId, -5); // Convert Hex string to Int. Mod only works on Ints. $intUuidLsb = (int) hexdec($uuidLsb); // Compute Mod $mod = $intUuidLsb % TOTAL_SHARDS; return $mod; } |
UUID Generation – Final Cut
We extend the UUID scheme a bit further in order to help us quickly determine the shard number from a given UUID string. We prefix the shard number (mod value) to every UUID. The final version of our scheme looks like this:
- Generate random number.
- Generate UniqId with the random number prefix.
- Generate UUID = mod(UniqId) + UniqId
Here is the final version of our code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | public function generateUUID() { // Generate a random prefix. $rand = md5(rand()); $randomPrefix = substr($rand, 0, 4); // Generate a UniqId having the random prefix. // (4 digit prefix + 13 digit Uniq) is good enough for us. $uniq = uniqid($randomPrefix); // Compute modulo. $modVal = $this->getModulo($uniq); // UUID = (1 digit mod val + 4 digit prefix + 13 digit Uniq). $uuid = $modVal.$uniq; return $uuid; } private function getModulo($uuid) { // Take 5 LSB digits only for the modulo computation. $uuidLsb = substr($uuid, -5); // Convert Hex string to Int. Mod only works on Ints. $intUuidLsb = (int) hexdec($uuidLsb); // Compute Mod $mod = $intUuidLsb % TOTAL_SHARDS; return $mod; } |
Please Note that this scheme is not compliant with the UUID RFC 4122, but should suffice for web applications in general.
Tags: PHP, scalability, webapps

12. March 2010 at 07:02
It’s good, it’s useful (as usual), actionable and concise. Love it.
29. March 2010 at 08:40
Very super information.
31. March 2010 at 07:16
My name is Piter Jankovich. Only want to tell, that your blog is really cool
And want to ask you: is this blog your hobby?
P.S. Sorry for my bad english
8. June 2010 at 10:29
You should really think about steering this blog into a dominant authority. You obviously have a good grasp of the areas and you could definitely even make a buck or three off of some ads. I would dive into following recent topics and raising the volume of write ups you put up and I bet you’d start earning some good traffic soon. Just an idea, good luck in whatever you do!
1. July 2010 at 20:59
I wanted to thank you for this excellent read!! I definitely loved every little bit of it. I have you bookmarked your site to look at the latest stuff you post.
8. July 2010 at 15:09
Excellent read, I just passed this onto a colleague who was doing a little research on that. And he actually bought me lunch because I found it for him smile So let me rephrase that: Thanks for lunch!
11. July 2010 at 00:03
I hope you will keep updating your content constantly as you have one dedicated reader here.
11. July 2010 at 12:40
Hi there could I reference some of the information from this post if I reference you with a link back to your site?