I'm doing some loading from an Oracle database, using ODP.NET.
In the current implementation the code does something like:
    query entityIds to load based on criteria
    foreach entityId
        load attributes
        query geometries that exist
        foreach geometry that exists
            load geometry
        next
    next
When the DB is on the local network, criteria that load 133 entities take a matter of seconds to load all 133 entities.
When the DB is a remote database located on a VM in a data center on the other side of the world, it takes about 3.5 minutes to load all of them.
The particularly slow part seems to be querying the geometry. In initial testing (in TOAD, not in the service's loading code) it seems to take a couple of seconds to load the geometry for a single entity when using the remote machine. When we change the query to load all the geometries in one go, it still seems to take 2 seconds. This seems to suggest it is not network overhead (the amount of data returned is much larger for the query that returns all the geometries, yet the time is the same).
Is this kind of performance overhead for a remote DB versus a local one expected? Why does doing each query individually take so much longer than doing them all in one go? Is there anything we can do to mitigate this (other than doing all the queries in one go)?
You are most likely running into the difference between bandwidth and latency.
Latency is the time taken for a single round trip, whereas bandwidth is the amount of data that can flow through in a given time period (e.g. 1 second).
If you are running 200 queries (from client-side code, not from a stored proc), then no matter how much data is in each query, you are going to get 200 round trips.
Typical latency to the other side of the world is about half a second, I believe, so for 200 entities retrieved individually, that's about 100 seconds.
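As a rough back-of-the-envelope check, here is a minimal Python sketch of that arithmetic. It assumes exactly one round trip per query and ignores query execution and transfer time. The function names are made up for illustration, and the figure of 200 queries behind the 3.5 minute load is an assumption, not something measured.

```python
def estimated_wait_seconds(query_count, round_trip_seconds):
    """Time spent purely waiting on round trips (one per query)."""
    return query_count * round_trip_seconds


def implied_latency_seconds(total_seconds, query_count):
    """Per-query latency implied by an observed total load time."""
    return total_seconds / query_count


# 200 individual queries at roughly half a second each
print(estimated_wait_seconds(200, 0.5))        # 100.0

# Observed 3.5 minutes, ASSUMING it was spent on about 200 queries
print(implied_latency_seconds(3.5 * 60, 200))  # 1.05
```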
Those numbers don't quite match yours, so there may be even higher latency (it depends on a variety of network factors). I'd normally look for query/lookup overhead on the database server (assuming there's an indexing problem), but you've already mentioned that locally there's no significant overhead (presumably with the same data?).
You are seeing network latency.
There has to be at least one round trip to the server each time you make a query.
If the server is "far", i.e. 500ms ping, that means at least one second of latency per query. This is incompressible: even if the query returns no rows, that 1s hit will happen.
Network bandwidth is an unrelated characteristic. If your bandwidth is high, you will not see a big difference between transferring a large dataset and a small one. But both will still suffer the latency hit in the same way.
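To make the distinction concrete, here is a small Python model (a sketch, not a measurement). It treats total time as a latency term paid once per round trip plus a payload term that depends on bandwidth. The 10 MB/s link and 50 KB per geometry are assumed numbers chosen only for illustration; the 133 entities come from the question and the 1 s per-query hit is the figure used above.

```python
def transfer_time_s(round_trips, latency_s, total_bytes, bandwidth_bytes_per_s):
    """Latency is paid once per round trip; bandwidth only affects the payload term."""
    return round_trips * latency_s + total_bytes / bandwidth_bytes_per_s


LATENCY_S = 1.0           # per-query latency hit
BANDWIDTH = 10_000_000    # ASSUMED: 10 MB/s link
GEOMETRY_BYTES = 50_000   # ASSUMED: 50 KB per geometry
ENTITIES = 133            # entity count from the question

payload = ENTITIES * GEOMETRY_BYTES

# Same total payload either way; only the number of round trips differs.
one_by_one = transfer_time_s(ENTITIES, LATENCY_S, payload, BANDWIDTH)
batched = transfer_time_s(1, LATENCY_S, payload, BANDWIDTH)

print(f"one query per entity: {one_by_one:.1f} s")  # 133.7 s
print(f"single batched query: {batched:.1f} s")     # 1.7 s
```

Under these assumptions the payload contributes well under a second in both cases, so almost all of the difference comes from the number of round trips.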
I found that this article has interesting (if dated) information: It's the Latency, Stupid.