What is the latency WITHIN a data center? I ask this assuming there are orders of magnitude of difference
I am trying to figure out something that I just cannot find a good answer to.
If I have say a REDIS cache (or some external in-memory cache) sitting in a data center, and an application server sitting in the same data center, what will be the speed of the network connection (latency, throughput) for reading data between these two machines?
Will the network "speed", for example, be still at least an order of magnitude higher than the speed of the RAM that is seeking my data out of the cache on REDIS?
My ultimate question is -- is having this all sitting in memory on REDIS actually providing any utility? Contrasted with if REDIS was caching this all to an SSD instead? Memory is expensive. If the network is indeed not a bottleneck WITHIN the data center, then the memory has value. Otherwise, it does not.
I guess my general question is despite the vast unknowns in data centers and the inability to generalize as well as the variances, are we talking sufficient orders of magnitude between memory latency in a computer system and even the best networks internal to a DC that the memory's reduced latencies don't provide a significant performance improvement? I get that that there are many variables, but how close is it? Is it so close that these variables do matter? For example, take take a hyperbolic stance on it, a tape drive is WAY slower than network, so tape is not ideal for a cache.
cache
New contributor
add a comment |
I am trying to figure out something that I just cannot find a good answer to.
If I have say a REDIS cache (or some external in-memory cache) sitting in a data center, and an application server sitting in the same data center, what will be the speed of the network connection (latency, throughput) for reading data between these two machines?
Will the network "speed", for example, be still at least an order of magnitude higher than the speed of the RAM that is seeking my data out of the cache on REDIS?
My ultimate question is -- is having this all sitting in memory on REDIS actually providing any utility? Contrasted with if REDIS was caching this all to an SSD instead? Memory is expensive. If the network is indeed not a bottleneck WITHIN the data center, then the memory has value. Otherwise, it does not.
I guess my general question is despite the vast unknowns in data centers and the inability to generalize as well as the variances, are we talking sufficient orders of magnitude between memory latency in a computer system and even the best networks internal to a DC that the memory's reduced latencies don't provide a significant performance improvement? I get that that there are many variables, but how close is it? Is it so close that these variables do matter? For example, take take a hyperbolic stance on it, a tape drive is WAY slower than network, so tape is not ideal for a cache.
cache
New contributor
add a comment |
I am trying to figure out something that I just cannot find a good answer to.
If I have say a REDIS cache (or some external in-memory cache) sitting in a data center, and an application server sitting in the same data center, what will be the speed of the network connection (latency, throughput) for reading data between these two machines?
Will the network "speed", for example, be still at least an order of magnitude higher than the speed of the RAM that is seeking my data out of the cache on REDIS?
My ultimate question is -- is having this all sitting in memory on REDIS actually providing any utility? Contrasted with if REDIS was caching this all to an SSD instead? Memory is expensive. If the network is indeed not a bottleneck WITHIN the data center, then the memory has value. Otherwise, it does not.
I guess my general question is despite the vast unknowns in data centers and the inability to generalize as well as the variances, are we talking sufficient orders of magnitude between memory latency in a computer system and even the best networks internal to a DC that the memory's reduced latencies don't provide a significant performance improvement? I get that that there are many variables, but how close is it? Is it so close that these variables do matter? For example, take take a hyperbolic stance on it, a tape drive is WAY slower than network, so tape is not ideal for a cache.
cache
New contributor
I am trying to figure out something that I just cannot find a good answer to.
If I have say a REDIS cache (or some external in-memory cache) sitting in a data center, and an application server sitting in the same data center, what will be the speed of the network connection (latency, throughput) for reading data between these two machines?
Will the network "speed", for example, be still at least an order of magnitude higher than the speed of the RAM that is seeking my data out of the cache on REDIS?
My ultimate question is -- is having this all sitting in memory on REDIS actually providing any utility? Contrasted with if REDIS was caching this all to an SSD instead? Memory is expensive. If the network is indeed not a bottleneck WITHIN the data center, then the memory has value. Otherwise, it does not.
I guess my general question is despite the vast unknowns in data centers and the inability to generalize as well as the variances, are we talking sufficient orders of magnitude between memory latency in a computer system and even the best networks internal to a DC that the memory's reduced latencies don't provide a significant performance improvement? I get that that there are many variables, but how close is it? Is it so close that these variables do matter? For example, take take a hyperbolic stance on it, a tape drive is WAY slower than network, so tape is not ideal for a cache.
cache
cache
New contributor
New contributor
New contributor
asked 6 hours ago
Neeraj MurarkaNeeraj Murarka
185
185
New contributor
New contributor
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
There are several versions of the "latency charts everyone should know" such as:
- https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html
- https://gist.github.com/jboner/2841832
- https://computers-are-fast.github.io/
The thing is, in reality, there is more than just latency. It's a combination of factors.
So, what's the network latency within a data center? Latency, well I would say it's "always" below 1ms. Is it faster than RAM? No. Is it close to RAM? I don't think so.
But the question remains, is it relevant. Is that the datum you need to know? Your question make sense to me. As everything has a cost, should you get more RAM so that all the data can stay in RAM or it's ok to read from disk from time to time.
Your "assumption" is that if the network latency is higher (slower) than the speed of the SSD, you won't be gaining by having all the data in RAM as you will have the slow on the network.
And it would appear so. But, you also have to take into account concurrency. If you receive 1,000 requests for the data at once, can the disk do 1,000 concurrent requests? Of course not, so how long will it take to serve up those 1,000 requests? Compared to RAM?
It's hard to boil it down to a single factor such as heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM.
Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck.
But remember that your disk is doing many other things, your process isn't the only process on the machine, your network may carry different things, etc.
Also, not all disk activity mean network traffic. The database query coming from an application to the database server is only very minimal network traffic. The response from the database server may be very small (a single number) or very large (thousand of rows with multiple fields). To perform the operation, a server (database server or not) may need to do multiple disk seeks, reads and writes yet only send a very small bit back over the network. It's definitely not one-for-one network-disk-RAM.
So far I avoided some details of your question - specifically, the Redis part.
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. - https://redis.io/
OK, so that means everything is in memory. Sorry, this fast SSD drive won't help you here. Redis can persist data to disk, so it can be loaded into RAM after a restart. That's only to not "lose" data or have to repopulate a cold cache after a restart. So in this case, you'll have to use the RAM, no matter what. You'll have to have enough RAM to contain your data set. Not enough RAM and I guess your OS will use swap
- probably not a good idea.
Thanks. This is indeed useful. There are indeed many contextual variances here that have a bearing on this. If we ignore heavy loads for a moment, it seems from your answer that indeed, network latency is the bottleneck, so the additional latency of SSD vs RAM is just not significant enough to matter. But now, if we take into account heavy loads, that SSD's latency differences relative to the RAM start to get compounded, and now, the RAM will shine. Is this what it comes down to then?
– Neeraj Murarka
4 hours ago
1
It's hard to boil it down to a single factor of heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM. Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck. But remember that your disk is doing many other things, your process isn't the only process on the machine, etc.
– ETL
3 hours ago
add a comment |
There are many layers of cache in computer systems. Inserting one at the application layer can be beneficial, caching API and database queries. And possibly temporary data like user sessions.
Data stores like Redis provide such a service over a network (fast) or UNIX socket (even faster), much like you would use a database.
You need to measure how your application actually performs, but let's make up an example. Say a common user request does 5 API queries that take 50 ms each. 250 ms is user detectable latency. Contrast to caching the results. Even if the cache is in a different availability zone across town (not optimal), hits are probably 10 ms at most. Which would be a 5x speedup.
In reality, the database and storage systems have their own caches as well. However, usually it is faster to get a pre-fetched result than to go through the database engine and storage system layers again. Also, the caching layer can take significant load off of the database behind it.
For an example of such a cache in production, look no further than the Stack Overflow infrastructure blog on architecture. Hundreds of thousands of HTTP requests generating billions of Redis hits is quite significant.
Memory is expensive.
DRAM at 100 ns access times is roughly 100x faster than solid state permanent storage. It is relatively inexpensive for this performance. For many applications, a bit more RAM buys valuable speed and response time.
Can you please clarify how you calculated that each of those 5 API queries take 50 ms each? Is that under the guise of the application hitting up the database and doing the query and calculating the result set, vs just hitting a cache across town that happens to have cached the query string itself as the key, and have a cached copy of that result set?
– Neeraj Murarka
28 mins ago
1
I made those numbers up, but yes. Doing a query and computing a result again is likely to be slower than getting that pre-computed result. Implementations like Redis tend to be in-memory for simplicity and speed. Traversing an IP network or UNIX socket transport can also be quite fast. All that said, this caching stuff is not required for every design.
– John Mahowald
9 mins ago
Understood. I think I more or less understand. It seems that in alot of cases, but not all the time, even traversing out of the data center to a nearby cache that is maybe in the same US state (or Canadian province, etc) (maybe region is a good semantic) can often be a great advantage over the process trying to re-calculate the value algorithmically from its own local database, if it does in fact result in a cache hit. But then, the cache that might be sitting remote does not offer alot of value by being in-memory. It may as well be SSD-based.
– Neeraj Murarka
2 mins ago
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "2"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Neeraj Murarka is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e) {
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom)) {
StackExchange.using('gps', function() { StackExchange.gps.track('embedded_signup_form.view', { location: 'question_page' }); });
$window.unbind('scroll', onScroll);
}
};
$window.on('scroll', onScroll);
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fserverfault.com%2fquestions%2f953169%2fwhat-is-the-latency-within-a-data-center-i-ask-this-assuming-there-are-orders-o%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
There are several versions of the "latency charts everyone should know" such as:
- https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html
- https://gist.github.com/jboner/2841832
- https://computers-are-fast.github.io/
The thing is, in reality, there is more than just latency. It's a combination of factors.
So, what's the network latency within a data center? Latency, well I would say it's "always" below 1ms. Is it faster than RAM? No. Is it close to RAM? I don't think so.
But the question remains, is it relevant. Is that the datum you need to know? Your question make sense to me. As everything has a cost, should you get more RAM so that all the data can stay in RAM or it's ok to read from disk from time to time.
Your "assumption" is that if the network latency is higher (slower) than the speed of the SSD, you won't be gaining by having all the data in RAM as you will have the slow on the network.
And it would appear so. But, you also have to take into account concurrency. If you receive 1,000 requests for the data at once, can the disk do 1,000 concurrent requests? Of course not, so how long will it take to serve up those 1,000 requests? Compared to RAM?
It's hard to boil it down to a single factor such as heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM.
Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck.
But remember that your disk is doing many other things, your process isn't the only process on the machine, your network may carry different things, etc.
Also, not all disk activity mean network traffic. The database query coming from an application to the database server is only very minimal network traffic. The response from the database server may be very small (a single number) or very large (thousand of rows with multiple fields). To perform the operation, a server (database server or not) may need to do multiple disk seeks, reads and writes yet only send a very small bit back over the network. It's definitely not one-for-one network-disk-RAM.
So far I avoided some details of your question - specifically, the Redis part.
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. - https://redis.io/
OK, so that means everything is in memory. Sorry, this fast SSD drive won't help you here. Redis can persist data to disk, so it can be loaded into RAM after a restart. That's only to not "lose" data or have to repopulate a cold cache after a restart. So in this case, you'll have to use the RAM, no matter what. You'll have to have enough RAM to contain your data set. Not enough RAM and I guess your OS will use swap
- probably not a good idea.
Thanks. This is indeed useful. There are indeed many contextual variances here that have a bearing on this. If we ignore heavy loads for a moment, it seems from your answer that indeed, network latency is the bottleneck, so the additional latency of SSD vs RAM is just not significant enough to matter. But now, if we take into account heavy loads, that SSD's latency differences relative to the RAM start to get compounded, and now, the RAM will shine. Is this what it comes down to then?
– Neeraj Murarka
4 hours ago
1
It's hard to boil it down to a single factor of heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM. Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck. But remember that your disk is doing many other things, your process isn't the only process on the machine, etc.
– ETL
3 hours ago
add a comment |
There are several versions of the "latency charts everyone should know" such as:
- https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html
- https://gist.github.com/jboner/2841832
- https://computers-are-fast.github.io/
The thing is, in reality, there is more than just latency. It's a combination of factors.
So, what's the network latency within a data center? Latency, well I would say it's "always" below 1ms. Is it faster than RAM? No. Is it close to RAM? I don't think so.
But the question remains, is it relevant. Is that the datum you need to know? Your question make sense to me. As everything has a cost, should you get more RAM so that all the data can stay in RAM or it's ok to read from disk from time to time.
Your "assumption" is that if the network latency is higher (slower) than the speed of the SSD, you won't be gaining by having all the data in RAM as you will have the slow on the network.
And it would appear so. But, you also have to take into account concurrency. If you receive 1,000 requests for the data at once, can the disk do 1,000 concurrent requests? Of course not, so how long will it take to serve up those 1,000 requests? Compared to RAM?
It's hard to boil it down to a single factor such as heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM.
Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck.
But remember that your disk is doing many other things, your process isn't the only process on the machine, your network may carry different things, etc.
Also, not all disk activity mean network traffic. The database query coming from an application to the database server is only very minimal network traffic. The response from the database server may be very small (a single number) or very large (thousand of rows with multiple fields). To perform the operation, a server (database server or not) may need to do multiple disk seeks, reads and writes yet only send a very small bit back over the network. It's definitely not one-for-one network-disk-RAM.
So far I avoided some details of your question - specifically, the Redis part.
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. - https://redis.io/
OK, so that means everything is in memory. Sorry, this fast SSD drive won't help you here. Redis can persist data to disk, so it can be loaded into RAM after a restart. That's only to not "lose" data or have to repopulate a cold cache after a restart. So in this case, you'll have to use the RAM, no matter what. You'll have to have enough RAM to contain your data set. Not enough RAM and I guess your OS will use swap
- probably not a good idea.
Thanks. This is indeed useful. There are indeed many contextual variances here that have a bearing on this. If we ignore heavy loads for a moment, it seems from your answer that indeed, network latency is the bottleneck, so the additional latency of SSD vs RAM is just not significant enough to matter. But now, if we take into account heavy loads, that SSD's latency differences relative to the RAM start to get compounded, and now, the RAM will shine. Is this what it comes down to then?
– Neeraj Murarka
4 hours ago
1
It's hard to boil it down to a single factor of heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM. Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck. But remember that your disk is doing many other things, your process isn't the only process on the machine, etc.
– ETL
3 hours ago
add a comment |
There are several versions of the "latency charts everyone should know" such as:
- https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html
- https://gist.github.com/jboner/2841832
- https://computers-are-fast.github.io/
The thing is, in reality, there is more than just latency. It's a combination of factors.
So, what's the network latency within a data center? Latency, well I would say it's "always" below 1ms. Is it faster than RAM? No. Is it close to RAM? I don't think so.
But the question remains, is it relevant. Is that the datum you need to know? Your question make sense to me. As everything has a cost, should you get more RAM so that all the data can stay in RAM or it's ok to read from disk from time to time.
Your "assumption" is that if the network latency is higher (slower) than the speed of the SSD, you won't be gaining by having all the data in RAM as you will have the slow on the network.
And it would appear so. But, you also have to take into account concurrency. If you receive 1,000 requests for the data at once, can the disk do 1,000 concurrent requests? Of course not, so how long will it take to serve up those 1,000 requests? Compared to RAM?
It's hard to boil it down to a single factor such as heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM.
Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck.
But remember that your disk is doing many other things, your process isn't the only process on the machine, your network may carry different things, etc.
Also, not all disk activity mean network traffic. The database query coming from an application to the database server is only very minimal network traffic. The response from the database server may be very small (a single number) or very large (thousand of rows with multiple fields). To perform the operation, a server (database server or not) may need to do multiple disk seeks, reads and writes yet only send a very small bit back over the network. It's definitely not one-for-one network-disk-RAM.
So far I avoided some details of your question - specifically, the Redis part.
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. - https://redis.io/
OK, so that means everything is in memory. Sorry, this fast SSD drive won't help you here. Redis can persist data to disk, so it can be loaded into RAM after a restart. That's only to not "lose" data or have to repopulate a cold cache after a restart. So in this case, you'll have to use the RAM, no matter what. You'll have to have enough RAM to contain your data set. Not enough RAM and I guess your OS will use swap
- probably not a good idea.
There are several versions of the "latency charts everyone should know" such as:
- https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html
- https://gist.github.com/jboner/2841832
- https://computers-are-fast.github.io/
The thing is, in reality, there is more than just latency. It's a combination of factors.
So, what's the network latency within a data center? Latency, well I would say it's "always" below 1ms. Is it faster than RAM? No. Is it close to RAM? I don't think so.
But the question remains, is it relevant. Is that the datum you need to know? Your question make sense to me. As everything has a cost, should you get more RAM so that all the data can stay in RAM or it's ok to read from disk from time to time.
Your "assumption" is that if the network latency is higher (slower) than the speed of the SSD, you won't be gaining by having all the data in RAM as you will have the slow on the network.
And it would appear so. But, you also have to take into account concurrency. If you receive 1,000 requests for the data at once, can the disk do 1,000 concurrent requests? Of course not, so how long will it take to serve up those 1,000 requests? Compared to RAM?
It's hard to boil it down to a single factor such as heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM.
Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck.
But remember that your disk is doing many other things, your process isn't the only process on the machine, your network may carry different things, etc.
Also, not all disk activity mean network traffic. The database query coming from an application to the database server is only very minimal network traffic. The response from the database server may be very small (a single number) or very large (thousand of rows with multiple fields). To perform the operation, a server (database server or not) may need to do multiple disk seeks, reads and writes yet only send a very small bit back over the network. It's definitely not one-for-one network-disk-RAM.
So far I avoided some details of your question - specifically, the Redis part.
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. - https://redis.io/
OK, so that means everything is in memory. Sorry, this fast SSD drive won't help you here. Redis can persist data to disk, so it can be loaded into RAM after a restart. That's only to not "lose" data or have to repopulate a cold cache after a restart. So in this case, you'll have to use the RAM, no matter what. You'll have to have enough RAM to contain your data set. Not enough RAM and I guess your OS will use swap
- probably not a good idea.
edited 2 hours ago
answered 4 hours ago
ETLETL
5,18211843
5,18211843
Thanks. This is indeed useful. There are indeed many contextual variances here that have a bearing on this. If we ignore heavy loads for a moment, it seems from your answer that indeed, network latency is the bottleneck, so the additional latency of SSD vs RAM is just not significant enough to matter. But now, if we take into account heavy loads, that SSD's latency differences relative to the RAM start to get compounded, and now, the RAM will shine. Is this what it comes down to then?
– Neeraj Murarka
4 hours ago
1
It's hard to boil it down to a single factor of heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM. Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck. But remember that your disk is doing many other things, your process isn't the only process on the machine, etc.
– ETL
3 hours ago
add a comment |
Thanks. This is indeed useful. There are indeed many contextual variances here that have a bearing on this. If we ignore heavy loads for a moment, it seems from your answer that indeed, network latency is the bottleneck, so the additional latency of SSD vs RAM is just not significant enough to matter. But now, if we take into account heavy loads, that SSD's latency differences relative to the RAM start to get compounded, and now, the RAM will shine. Is this what it comes down to then?
– Neeraj Murarka
4 hours ago
1
It's hard to boil it down to a single factor of heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM. Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck. But remember that your disk is doing many other things, your process isn't the only process on the machine, etc.
– ETL
3 hours ago
Thanks. This is indeed useful. There are indeed many contextual variances here that have a bearing on this. If we ignore heavy loads for a moment, it seems from your answer that indeed, network latency is the bottleneck, so the additional latency of SSD vs RAM is just not significant enough to matter. But now, if we take into account heavy loads, that SSD's latency differences relative to the RAM start to get compounded, and now, the RAM will shine. Is this what it comes down to then?
– Neeraj Murarka
4 hours ago
Thanks. This is indeed useful. There are indeed many contextual variances here that have a bearing on this. If we ignore heavy loads for a moment, it seems from your answer that indeed, network latency is the bottleneck, so the additional latency of SSD vs RAM is just not significant enough to matter. But now, if we take into account heavy loads, that SSD's latency differences relative to the RAM start to get compounded, and now, the RAM will shine. Is this what it comes down to then?
– Neeraj Murarka
4 hours ago
1
1
It's hard to boil it down to a single factor of heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM. Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck. But remember that your disk is doing many other things, your process isn't the only process on the machine, etc.
– ETL
3 hours ago
It's hard to boil it down to a single factor of heavy loads. But yes, if you had a single operation going, the latency of the network is such that you would probably not notice the difference of SSD vs RAM. Just like until 12Gbps disk showed up on the market, a 10Gbps network link would not be overloaded by a single stream as the disk were the bottleneck. But remember that your disk is doing many other things, your process isn't the only process on the machine, etc.
– ETL
3 hours ago
add a comment |
There are many layers of cache in computer systems. Inserting one at the application layer can be beneficial, caching API and database queries. And possibly temporary data like user sessions.
Data stores like Redis provide such a service over a network (fast) or UNIX socket (even faster), much like you would use a database.
You need to measure how your application actually performs, but let's make up an example. Say a common user request does 5 API queries that take 50 ms each. 250 ms is user detectable latency. Contrast to caching the results. Even if the cache is in a different availability zone across town (not optimal), hits are probably 10 ms at most. Which would be a 5x speedup.
In reality, the database and storage systems have their own caches as well. However, usually it is faster to get a pre-fetched result than to go through the database engine and storage system layers again. Also, the caching layer can take significant load off of the database behind it.
For an example of such a cache in production, look no further than the Stack Overflow infrastructure blog on architecture. Hundreds of thousands of HTTP requests generating billions of Redis hits is quite significant.
Memory is expensive.
DRAM at 100 ns access times is roughly 100x faster than solid state permanent storage. It is relatively inexpensive for this performance. For many applications, a bit more RAM buys valuable speed and response time.
Can you please clarify how you calculated that each of those 5 API queries take 50 ms each? Is that under the guise of the application hitting up the database and doing the query and calculating the result set, vs just hitting a cache across town that happens to have cached the query string itself as the key, and have a cached copy of that result set?
– Neeraj Murarka
28 mins ago
1
I made those numbers up, but yes. Doing a query and computing a result again is likely to be slower than getting that pre-computed result. Implementations like Redis tend to be in-memory for simplicity and speed. Traversing an IP network or UNIX socket transport can also be quite fast. All that said, this caching stuff is not required for every design.
– John Mahowald
9 mins ago
Understood. I think I more or less understand. It seems that in alot of cases, but not all the time, even traversing out of the data center to a nearby cache that is maybe in the same US state (or Canadian province, etc) (maybe region is a good semantic) can often be a great advantage over the process trying to re-calculate the value algorithmically from its own local database, if it does in fact result in a cache hit. But then, the cache that might be sitting remote does not offer alot of value by being in-memory. It may as well be SSD-based.
– Neeraj Murarka
2 mins ago
add a comment |
There are many layers of cache in computer systems. Inserting one at the application layer can be beneficial, caching API and database queries. And possibly temporary data like user sessions.
Data stores like Redis provide such a service over a network (fast) or UNIX socket (even faster), much like you would use a database.
You need to measure how your application actually performs, but let's make up an example. Say a common user request does 5 API queries that take 50 ms each. 250 ms is user detectable latency. Contrast to caching the results. Even if the cache is in a different availability zone across town (not optimal), hits are probably 10 ms at most. Which would be a 5x speedup.
In reality, the database and storage systems have their own caches as well. However, usually it is faster to get a pre-fetched result than to go through the database engine and storage system layers again. Also, the caching layer can take significant load off of the database behind it.
For an example of such a cache in production, look no further than the Stack Overflow infrastructure blog on architecture. Hundreds of thousands of HTTP requests generating billions of Redis hits is quite significant.
Memory is expensive.
DRAM at 100 ns access times is roughly 100x faster than solid state permanent storage. It is relatively inexpensive for this performance. For many applications, a bit more RAM buys valuable speed and response time.
Can you please clarify how you calculated that each of those 5 API queries take 50 ms each? Is that under the guise of the application hitting up the database and doing the query and calculating the result set, vs just hitting a cache across town that happens to have cached the query string itself as the key, and have a cached copy of that result set?
– Neeraj Murarka
28 mins ago
1
I made those numbers up, but yes. Doing a query and computing a result again is likely to be slower than getting that pre-computed result. Implementations like Redis tend to be in-memory for simplicity and speed. Traversing an IP network or UNIX socket transport can also be quite fast. All that said, this caching stuff is not required for every design.
– John Mahowald
9 mins ago
Understood. I think I more or less understand. It seems that in alot of cases, but not all the time, even traversing out of the data center to a nearby cache that is maybe in the same US state (or Canadian province, etc) (maybe region is a good semantic) can often be a great advantage over the process trying to re-calculate the value algorithmically from its own local database, if it does in fact result in a cache hit. But then, the cache that might be sitting remote does not offer alot of value by being in-memory. It may as well be SSD-based.
– Neeraj Murarka
2 mins ago
add a comment |
There are many layers of cache in computer systems. Inserting one at the application layer can be beneficial, caching API and database queries. And possibly temporary data like user sessions.
Data stores like Redis provide such a service over a network (fast) or UNIX socket (even faster), much like you would use a database.
You need to measure how your application actually performs, but let's make up an example. Say a common user request does 5 API queries that take 50 ms each. 250 ms is user detectable latency. Contrast to caching the results. Even if the cache is in a different availability zone across town (not optimal), hits are probably 10 ms at most. Which would be a 5x speedup.
In reality, the database and storage systems have their own caches as well. However, usually it is faster to get a pre-fetched result than to go through the database engine and storage system layers again. Also, the caching layer can take significant load off of the database behind it.
For an example of such a cache in production, look no further than the Stack Overflow infrastructure blog on architecture. Hundreds of thousands of HTTP requests generating billions of Redis hits is quite significant.
Memory is expensive.
DRAM at 100 ns access times is roughly 100x faster than solid state permanent storage. It is relatively inexpensive for this performance. For many applications, a bit more RAM buys valuable speed and response time.
There are many layers of cache in computer systems. Inserting one at the application layer can be beneficial, caching API and database queries. And possibly temporary data like user sessions.
Data stores like Redis provide such a service over a network (fast) or UNIX socket (even faster), much like you would use a database.
You need to measure how your application actually performs, but let's make up an example. Say a common user request does 5 API queries that take 50 ms each. 250 ms is user detectable latency. Contrast to caching the results. Even if the cache is in a different availability zone across town (not optimal), hits are probably 10 ms at most. Which would be a 5x speedup.
In reality, the database and storage systems have their own caches as well. However, usually it is faster to get a pre-fetched result than to go through the database engine and storage system layers again. Also, the caching layer can take significant load off of the database behind it.
For an example of such a cache in production, look no further than the Stack Overflow infrastructure blog on architecture. Hundreds of thousands of HTTP requests generating billions of Redis hits is quite significant.
Memory is expensive.
DRAM at 100 ns access times is roughly 100x faster than solid state permanent storage. It is relatively inexpensive for this performance. For many applications, a bit more RAM buys valuable speed and response time.
answered 31 mins ago
John MahowaldJohn Mahowald
7,1281713
7,1281713
Can you please clarify how you calculated that each of those 5 API queries take 50 ms each? Is that under the guise of the application hitting up the database and doing the query and calculating the result set, vs just hitting a cache across town that happens to have cached the query string itself as the key, and have a cached copy of that result set?
– Neeraj Murarka
28 mins ago
1
I made those numbers up, but yes. Doing a query and computing a result again is likely to be slower than getting that pre-computed result. Implementations like Redis tend to be in-memory for simplicity and speed. Traversing an IP network or UNIX socket transport can also be quite fast. All that said, this caching stuff is not required for every design.
– John Mahowald
9 mins ago
Understood. I think I more or less understand. It seems that in alot of cases, but not all the time, even traversing out of the data center to a nearby cache that is maybe in the same US state (or Canadian province, etc) (maybe region is a good semantic) can often be a great advantage over the process trying to re-calculate the value algorithmically from its own local database, if it does in fact result in a cache hit. But then, the cache that might be sitting remote does not offer alot of value by being in-memory. It may as well be SSD-based.
– Neeraj Murarka
2 mins ago
add a comment |
Can you please clarify how you calculated that each of those 5 API queries take 50 ms each? Is that under the guise of the application hitting up the database and doing the query and calculating the result set, vs just hitting a cache across town that happens to have cached the query string itself as the key, and have a cached copy of that result set?
– Neeraj Murarka
28 mins ago
1
I made those numbers up, but yes. Doing a query and computing a result again is likely to be slower than getting that pre-computed result. Implementations like Redis tend to be in-memory for simplicity and speed. Traversing an IP network or UNIX socket transport can also be quite fast. All that said, this caching stuff is not required for every design.
– John Mahowald
9 mins ago
Understood. I think I more or less understand. It seems that in alot of cases, but not all the time, even traversing out of the data center to a nearby cache that is maybe in the same US state (or Canadian province, etc) (maybe region is a good semantic) can often be a great advantage over the process trying to re-calculate the value algorithmically from its own local database, if it does in fact result in a cache hit. But then, the cache that might be sitting remote does not offer alot of value by being in-memory. It may as well be SSD-based.
– Neeraj Murarka
2 mins ago
Can you please clarify how you calculated that each of those 5 API queries take 50 ms each? Is that under the guise of the application hitting up the database and doing the query and calculating the result set, vs just hitting a cache across town that happens to have cached the query string itself as the key, and have a cached copy of that result set?
– Neeraj Murarka
28 mins ago
Can you please clarify how you calculated that each of those 5 API queries take 50 ms each? Is that under the guise of the application hitting up the database and doing the query and calculating the result set, vs just hitting a cache across town that happens to have cached the query string itself as the key, and have a cached copy of that result set?
– Neeraj Murarka
28 mins ago
1
1
I made those numbers up, but yes. Doing a query and computing a result again is likely to be slower than getting that pre-computed result. Implementations like Redis tend to be in-memory for simplicity and speed. Traversing an IP network or UNIX socket transport can also be quite fast. All that said, this caching stuff is not required for every design.
– John Mahowald
9 mins ago
I made those numbers up, but yes. Doing a query and computing a result again is likely to be slower than getting that pre-computed result. Implementations like Redis tend to be in-memory for simplicity and speed. Traversing an IP network or UNIX socket transport can also be quite fast. All that said, this caching stuff is not required for every design.
– John Mahowald
9 mins ago
Understood. I think I more or less understand. It seems that in alot of cases, but not all the time, even traversing out of the data center to a nearby cache that is maybe in the same US state (or Canadian province, etc) (maybe region is a good semantic) can often be a great advantage over the process trying to re-calculate the value algorithmically from its own local database, if it does in fact result in a cache hit. But then, the cache that might be sitting remote does not offer alot of value by being in-memory. It may as well be SSD-based.
– Neeraj Murarka
2 mins ago
Understood. I think I more or less understand. It seems that in alot of cases, but not all the time, even traversing out of the data center to a nearby cache that is maybe in the same US state (or Canadian province, etc) (maybe region is a good semantic) can often be a great advantage over the process trying to re-calculate the value algorithmically from its own local database, if it does in fact result in a cache hit. But then, the cache that might be sitting remote does not offer alot of value by being in-memory. It may as well be SSD-based.
– Neeraj Murarka
2 mins ago
add a comment |
Neeraj Murarka is a new contributor. Be nice, and check out our Code of Conduct.
Neeraj Murarka is a new contributor. Be nice, and check out our Code of Conduct.
Neeraj Murarka is a new contributor. Be nice, and check out our Code of Conduct.
Neeraj Murarka is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Server Fault!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e) {
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom)) {
StackExchange.using('gps', function() { StackExchange.gps.track('embedded_signup_form.view', { location: 'question_page' }); });
$window.unbind('scroll', onScroll);
}
};
$window.on('scroll', onScroll);
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fserverfault.com%2fquestions%2f953169%2fwhat-is-the-latency-within-a-data-center-i-ask-this-assuming-there-are-orders-o%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e) {
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom)) {
StackExchange.using('gps', function() { StackExchange.gps.track('embedded_signup_form.view', { location: 'question_page' }); });
$window.unbind('scroll', onScroll);
}
};
$window.on('scroll', onScroll);
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e) {
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom)) {
StackExchange.using('gps', function() { StackExchange.gps.track('embedded_signup_form.view', { location: 'question_page' }); });
$window.unbind('scroll', onScroll);
}
};
$window.on('scroll', onScroll);
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
var $window = $(window),
onScroll = function(e) {
var $elem = $('.new-login-left'),
docViewTop = $window.scrollTop(),
docViewBottom = docViewTop + $window.height(),
elemTop = $elem.offset().top,
elemBottom = elemTop + $elem.height();
if ((docViewTop elemBottom)) {
StackExchange.using('gps', function() { StackExchange.gps.track('embedded_signup_form.view', { location: 'question_page' }); });
$window.unbind('scroll', onScroll);
}
};
$window.on('scroll', onScroll);
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown