Psy breaks YouTube... YouTube repeats mistake!
I had to laugh at this one.
YouTube used signed 32-bit integers for their view counters. Frankly, I generally wouldn't see a problem with that. A signed 32-bit integer tops out at 2,147,483,647, so for a video to be viewed 2 billion+ times means it would have been viewed over a quarter as many times as there are people on Earth.
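To make the failure mode concrete, here is a minimal sketch of what happens when a signed 32-bit counter is pushed one past its maximum. It assumes plain two's-complement wraparound; how YouTube's counter actually behaved at the limit is not something this post knows, and `wrap_int32` is a hypothetical helper, not anything from their code.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Reduce an arbitrary Python int to the signed 32-bit range.

    Assumption: two's-complement wraparound, which is what most hardware does.
    """
    return (value - INT32_MIN) % 2**32 + INT32_MIN


# Start at the largest value a signed 32-bit counter can hold, then add one view.
views = INT32_MAX
views_after_one_more = wrap_int32(views + 1)

print(views, "->", views_after_one_more)
```

Under that assumption the counter does not stop at the cap; it flips to the most negative value, which is exactly the kind of number a view counter should never show.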
So yeah, kind of surprising that any video would hit that value.
But here is the thing... you can't have negative views, so using an unsigned integer automatically makes more sense. In fact, an unsigned integer effectively doubles the limit before it breaks, moving the mark from roughly 1/4 of a view per living human to 1/2. While it isn't an order of magnitude better, it isn't a shabby improvement either.
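The arithmetic behind the 1/4 and 1/2 figures can be sketched quickly. The world population constant is my own ballpark assumption (roughly 7.2 billion), not a number from any official source.

```python
INT32_MAX = 2**31 - 1    # 2,147,483,647
UINT32_MAX = 2**32 - 1   # 4,294,967,295

# Assumption: about 7.2 billion people, a rough figure used only for illustration.
WORLD_POPULATION = 7_200_000_000

signed_views_per_person = INT32_MAX / WORLD_POPULATION
unsigned_views_per_person = UINT32_MAX / WORLD_POPULATION

print(f"signed 32-bit cap:   {INT32_MAX:,} ({signed_views_per_person:.2f} views per person)")
print(f"unsigned 32-bit cap: {UINT32_MAX:,} ({unsigned_views_per_person:.2f} views per person)")
```

Dropping the sign bit gives exactly twice the old cap plus one, with no extra storage.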
I don't blame them for not starting with an unsigned integer though, precisely because the signed limit already looked unreachable. But if you're going to change data structures, you might as well get the most bang for your buck. So... they went to 64-bit integers. This is a huge improvement.
BUT... they still went with a signed 64-bit integer.
WHY?!?!?!?! You just finished hitting a cap you never thought you would hit... so why the hell would you move to a larger memory footprint and insist on wasting half of the available values in a range you will never touch?
I don't fault people for making a mistake once. Especially not this mistake. People regularly choose int as a catch-all for any whole-number value. And that is fine. When you first start an application or web site you can't really know how big you'll grow and what the demands will become. But once you reach those limits, you should be thinking more intelligently about how you are storing data.
I would even argue that while moving up to 64-bit integers is better future-proofing, moving to unsigned 32-bit integers would likely have addressed the issue for years (hell, it took Gangnam Style well over a year to get that many views and it isn't getting more popular with time). And it wouldn't have increased the memory/DB footprint. Upping it to 64-bit doubles the memory footprint for the counters.
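The footprint claim is easy to check with Python's `struct` module, which reports how many bytes each fixed-width integer takes. This is only a sketch of raw storage sizes; what a particular database actually allocates per column is outside what it shows.

```python
import struct

# "<" selects the standard sizes, independent of the machine running this.
SIZES = {
    "signed 32-bit": struct.calcsize("<i"),
    "unsigned 32-bit": struct.calcsize("<I"),
    "signed 64-bit": struct.calcsize("<q"),
    "unsigned 64-bit": struct.calcsize("<Q"),
}

for name, size in SIZES.items():
    print(f"{name}: {size} bytes")

# The largest unsigned 32-bit value still fits in the same 4 bytes.
packed = struct.pack("<I", 2**32 - 1)
```

Switching signedness costs nothing in space, while switching width doubles it.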
Granted... compared to the video data, a single 64-bit int is nothing but a speck of dust in the DB space consumed, and the limit on signed 64-bit integers is *probably* large enough that YouTube will cease to exist long before they reach that cap. The point is, though, that they probably felt the same way about 32-bit signed integers. So why not use the most efficient solution possible and double your capacity?
If they manage to hit their cap on 64-bit integers before 128-bit processors and OSes become available, I'll have an even bigger laugh.
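For a rough sense of how far away the 64-bit caps are, here is a back-of-the-envelope sketch. The rate of one billion views per day on a single video is a made-up number, picked only because it is absurdly generous, and `years_to_reach` is a hypothetical helper.

```python
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1

# Hypothetical rate, for illustration only: one billion views per day on one video.
VIEWS_PER_DAY = 1_000_000_000
DAYS_PER_YEAR = 365.25


def years_to_reach(cap: int, views_per_day: int = VIEWS_PER_DAY) -> float:
    """Years needed to count from zero up to `cap` at a constant daily rate."""
    return cap / views_per_day / DAYS_PER_YEAR


print(f"signed 64-bit:   {years_to_reach(INT64_MAX):,.0f} years")
print(f"unsigned 64-bit: {years_to_reach(UINT64_MAX):,.0f} years")
```

Even at that rate the signed cap is tens of millions of years out, and the unsigned one is twice as far.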
Are there reasons to go with signed ints given everything above? Sure. Maybe negative values are used in some "special" cases. But good coding practice would say that a page view counter should only track page views. It could also be that, depending on how the data is stored, they didn't want to bother with a conversion. I'm having a hard time thinking of cases where converting from a signed 32-bit int to an unsigned 64-bit int would be a challenge... but perhaps. And even then it is simply slacking instead of doing it right.
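As a sketch of why the conversion itself shouldn't be hard, here is one way to widen a stored counter from signed 32-bit to unsigned 64-bit. It assumes the counter is stored as raw little-endian bytes, which is purely my assumption, and `widen_counter` is a hypothetical name, not anything YouTube uses.

```python
import struct


def widen_counter(raw: bytes) -> bytes:
    """Convert a 4-byte signed counter into an 8-byte unsigned one.

    Assumption: little-endian storage. Negative values are rejected because
    an unsigned counter has no way to represent them.
    """
    (value,) = struct.unpack("<i", raw)
    if value < 0:
        raise ValueError(f"negative view count {value} cannot be stored unsigned")
    return struct.pack("<Q", value)


old = struct.pack("<i", 2_147_483_647)
new = widen_counter(old)
print(len(old), "bytes ->", len(new), "bytes")
```

The only real decision is what to do with negative values, which is where any "special" cases would have to be handled explicitly.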