I've got a databases table with ~50K rows inside it, each row signifies employment that should be done. I've got a program that extracts employment in the DB, does the task and puts the end result in the db. (this technique is running at this time)
Now I wish to allow several processing task to complete jobs but make sure that no task is performed two times (like a performance concern not too this makes other issues). Since the access is using a sproce, my current though would be to replace stated sproce with something which looks something similar to this
update tbl set owner=connection_id() where avalable and owner is null limit 1; select stuff from tbl where owner = connection_id();
BTW worker's tasks might drop there link between obtaining a job and posting the outcomes. Also, I do not expect the DB to even compare to being the bottle neck unless of course I mess that part up (~5 jobs each minute)
What are the difficulties with this? It is possible to better method of doing this?
Note: the "Database being an IPC anti-pattern" is just slightly apropos here because 1) I am not doing IPC (there's no process producing the rows, all of them already exist at this time) and a pair of) the main gripe referred to for your anti-pattern is it leads to needless strain on the DB as processes watch for messages (during my situation, if you will find no messages, everything can shutdown as things are done)
Some tips about what I have used effectively previously:
MsgQueue table schema
MsgId identity -- NOT NULL
MsgTypeCode varchar(20) -- NOT NULL
SourceCode varchar(20) -- process placing the content -- NULLable
Condition char(1) -- 'N'ew if queued, 'A'(ctive) if processing, 'C'ompleted, default 'N' -- NOT NULL CreateTime datetime -- default GETDATE() -- NOT NULL
Monosodium glutamate varchar(255) -- NULLable
Your message types are what you'd expect - messages that comply with an agreement between your process(es) placing and also the process(es) reading through, structured with XML or perhaps your other option of representation (JSON could be handy in some instances, for example).
Then -to-n processes could be placing, and -to-n processes could be reading through and processing the messages, Each reading through process typically handles just one message type. Multiple cases of a procedure type could be running for load-balancing.
The readers pulls one message and changes the condition to "A"ctive although it creates it. When it is tried it changes the condition to "C"omplete. It may remove the content or otherwise based on whether you need to keep your audit trail. Messages of Condition = 'N' are drawn in MsgType/Timestamp order, so there's a catalog on MsgType + Condition + CreateTime.
Condition for "E"rror.
Column for Readers process code.
Timestamps for condition transitions.
It has provided a pleasant, scalable, visible, simple mechanism for doing numerous things as if you are explaining. For those who have a fundamental knowledge of databases, it's pretty foolproof and extensible.
Rather than getting owner = null if this is not possessed, you need to place it to some fake nobody record rather. Trying to find null does not limit the index, you may finish track of a table scan. (to oracle, SQL server may be different)
Just just as one technology change, you may think about using MSMQ or something like that similar.
All of your jobs / threads could query the texting queue to ascertain if a brand new job was available. Because the action of reading through a note removes it in the stack, you're made certain that just one job / thread would obtain the message.
Obviously, this really is presuming you're using a Microsoft platform.
You are attempting to implement p "Database as IPC" antipattern. Look up to know why you need to consider redecorating your software correctly.