I've got a large database and wish to implement a feature which will allow a user to perform a bulk update of data. The user downloads an Excel file, makes the changes, and the system accepts the Excel file.

  1. The user uses a web interface (ASP.NET) to download the data from the database to Excel.
  2. The user modifies the Excel file. Only certain data is allowed to be modified, as the rest maps into the DB.
  3. When the user is satisfied with their changes, they upload the changed Excel file through the ASP.NET interface.
  4. Now it's the server's job to pull the data out of the Excel file (using GemBox) and validate it against the database (this is where I'm having the problem). A rough sketch of this step is shown after the list.
  5. Validation results are shown on another ASP.NET page after validation is complete. Validation is soft, so hard failures only occur when, say, an index mapping into the DB is missing. (Missing data causes an ignore, etc.)
  6. The user can decide whether the actions that will be taken are OK; on accepting them, the system will apply the changes. (Add, Modify, or Ignore)
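
For reference, step 4 might look something like the sketch below — assuming a recent GemBox.Spreadsheet API (ExcelFile.Load, Worksheets, Cells) and an invented column layout (Id, Name, StartDate), not my actual schema:

```csharp
// Sketch of step 4: read the uploaded workbook into a DataTable for validation.
// Column layout is invented -- Id in column A, Name in B, StartDate in C.
using System;
using System.Data;
using GemBox.Spreadsheet;

static class UploadReader
{
    public static DataTable ReadUpload(string path)
    {
        SpreadsheetInfo.SetLicense("FREE-LIMITED-KEY");   // or your licence key

        var table = new DataTable("Upload");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("StartDate", typeof(string));  // keep raw text; validate later

        ExcelWorksheet sheet = ExcelFile.Load(path).Worksheets[0];
        for (int r = 1; r < sheet.Rows.Count; r++)        // row 0 is the header
        {
            object id = sheet.Cells[r, 0].Value;
            if (id == null) continue;                     // skip blank rows
            table.Rows.Add(Convert.ToInt32(id),
                           Convert.ToString(sheet.Cells[r, 1].Value),
                           Convert.ToString(sheet.Cells[r, 2].Value));
        }
        return table;
    }
}
```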

Before applying the changes and/or additions the user makes, the data must be validated to prevent mistakes by the user (e.g. accidentally erased dates that they didn't mean to remove).

It isn't implausible for the rows that require updating to reach over 65k.

The question is: what's the best way to parse the data to perform the validation and to build up the modification and addition sets?

If I load all the data that the Excel data must be validated against into memory, I would be unnecessarily burdening an already memory-hungry application. If I perform a database hit for every tuple in the Excel file, I'm looking at over 65k database hits.

Help?

The approach I have seen used previously is:

  1. Bulk-load the user's data into a 'scratch' table in the database.
  2. Validate the data in the scratch table with a single stored procedure (performing a series of queries), marking rows that fail validation, require an update, etc.
  3. Action the marked rows as appropriate.

This works well for validating missing columns, valid key values, etc. It isn't so great for checking the format of individual fields (don't make SQL pull strings apart).

As you may know, some folk feel uncomfortable putting business logic in the database, but this approach does limit the number of database hits the application makes, and avoids holding all of the data in memory at once. A sketch of steps 1 and 2 follows.
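
For illustration only, steps 1 and 2 could look roughly like this — a sketch assuming a SQL Server back end; the scratch table (dbo.ScratchUpload), stored procedure (dbo.usp_ValidateUpload) and the DataTable built from the Excel file are placeholder names, not anything prescribed above:

```csharp
// Sketch of the scratch-table approach (steps 1 and 2) against SQL Server.
// Substitute your own table, procedure and column names.
using System.Data;
using System.Data.SqlClient;

static class UploadStager
{
    public static void StageAndValidate(DataTable upload, string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // Step 1: one bulk insert instead of 65k+ individual INSERTs.
            using (var bulk = new SqlBulkCopy(connection))
            {
                bulk.DestinationTableName = "dbo.ScratchUpload";
                bulk.WriteToServer(upload);
            }

            // Step 2: a single stored procedure does the set-based validation,
            // marking each scratch row as Add / Modify / Ignore / Error.
            using (var validate = new SqlCommand("dbo.usp_ValidateUpload", connection))
            {
                validate.CommandType = CommandType.StoredProcedure;
                validate.CommandTimeout = 300;   // large uploads can take a while
                validate.ExecuteNonQuery();
            }
        }
    }
}
```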

Your problem is very common in Data Warehouse systems, where bulk uploads and data cleansing are a core part of the (regular) work to be done. I suggest you google around ETL (Extract, Transform, Load) and staging tables, and you will find a wealth of good material.

In broad answer to your problem, if you do 'load the data into memory' for checking, you are effectively re-implementing part of the DB engine in your own code. Now that may be a good thing if it's faster and cleverer to do so. For example, you may have only a small range of valid dates for your Excel extract, so you don't need to join to a table to check that the dates are in range. However, for other data such as foreign keys, let the DB do what it's good at.
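
As a small illustration of that split: a cheap range check can stay in code, while a foreign-key check is one set-based query against the staging table. The table and column names here (ScratchUpload, Product, ProductId) are invented for the example:

```csharp
// Illustration: date-range rule in code, foreign-key rule in the database.
using System;
using System.Data.SqlClient;

static class ValidationChecks
{
    // No database round-trip needed for a simple "is the date sane?" rule.
    public static bool DateInRange(DateTime d)
    {
        return d >= new DateTime(2000, 1, 1) && d <= DateTime.Today;
    }

    // One query validates the foreign key for every uploaded row at once.
    public static void FlagMissingForeignKeys(SqlConnection connection)
    {
        const string sql = @"
            UPDATE s SET s.Status = 'Error', s.Message = 'Unknown ProductId'
            FROM dbo.ScratchUpload s
            LEFT JOIN dbo.Product p ON p.ProductId = s.ProductId
            WHERE p.ProductId IS NULL;";
        using (var cmd = new SqlCommand(sql, connection))
            cmd.ExecuteNonQuery();
    }
}
```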

Using a staging table/database/server is a common solution as the data volumes get large. BTW, allowing users to clean data in Excel is a very good idea; allowing them to 'accidentally' remove crucial data is a very bad one. Can you lock cells/columns to prevent this, and/or put some basic validation into Excel? If a field must be populated and must be a date, you can check that in a couple of lines of Excel. Your users will be happier because they don't have to upload the file before finding out about problems.

To answer this properly, the following information would be helpful:

  1. How are you going to inform the user of failures?
  2. Will one validation failure lead to loading 64,999 records or none?

First, bulk-upload the data from the file into a temp table. Then retrieve it and validate it using the interface you have made, and after validation store it in the main table or DB. A sketch of that final step follows.
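
For illustration, the final "apply" step might look something like this — a sketch assuming SQL Server, a staging table the validation pass has already stamped with an Action of 'Add' or 'Modify', and a main table; every object name is a placeholder:

```csharp
// Sketch of the final step: apply the validated staging rows to the main table.
using System.Data.SqlClient;

static class UploadApplier
{
    public static void ApplyChanges(SqlConnection connection)
    {
        const string sql = @"
            UPDATE t SET t.Name = s.Name, t.StartDate = s.StartDate
            FROM dbo.MainTable t
            JOIN dbo.ScratchUpload s ON s.Id = t.Id
            WHERE s.Action = 'Modify';

            INSERT INTO dbo.MainTable (Id, Name, StartDate)
            SELECT s.Id, s.Name, s.StartDate
            FROM dbo.ScratchUpload s
            WHERE s.Action = 'Add';";

        using (var cmd = new SqlCommand(sql, connection))
        {
            cmd.CommandTimeout = 300;   // 65k+ rows may take a moment
            cmd.ExecuteNonQuery();
        }
    }
}
```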