I need to post a form to a CGI script locally (w3c-markup-validator), but it is too slow using curl and Apache. I want to call this CGI script more than 5,000 times from another script, and that currently takes more than an hour.

What should I do to pass the form directly to the CGI script? (I upload a file with curl.)

Edit: It seemed too complicated and time-consuming for what I needed, so I just waited an hour and a half each time I needed to test my generated XHTML files. In the end I did not test any of the answers below, so the question will remain open.

Depending on the details of the script, you might be able to create a fake CGI environment using HTTP::Request::AsCGI and then source the CGI script with the "do" operator. But when it comes to speed and maintainability, your best bet would be to factor the main part of the script's work into its own module and rewrite the CGI as a client of that module. That way you don't have to invoke it as a CGI at all: the batch job you're talking about would be just another program using the same module to do the same work, without CGI or the web server environment getting in the way.

OK, I looked at the source code for this thing, and it is not easy to extract the validation logic from all the rest. So here is what I would do.

First, ditch curl. Starting a new process for every file you want to validate is not a good idea. You will need to write a driver script that takes a list of URLs and submits them to your local server running on localhost. In fact, you might later want to parallelize this, because there will normally be a bunch of httpd processes alive anyway. But I am getting ahead of myself.

This script can use LWP, because all you are doing is posting some data to the CGI script on localhost and storing or processing the results. You do not need the full WWW::Mechanize functionality.
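The answer has Perl and LWP in mind, but the shape of the driver script is the same in any language. Below is a minimal Python sketch of the idea: one long-lived process that posts every document over a reused HTTP connection instead of spawning curl each time. The function name `validate_documents`, the default path `/w3c-validator/check`, and the form field name `fragment` are assumptions for illustration; check the validator's form for the real values, and note that a file upload would need a multipart body rather than the URL-encoded one used here.

```python
import http.client
import urllib.parse


def validate_documents(documents, host="localhost", port=80,
                       path="/w3c-validator/check"):
    """POST each document to a local CGI and collect the responses.

    `documents` maps a name to its markup text.
    Returns {name: (http_status, response_body_text)}.
    The default `path` and the "fragment" field are assumed, not confirmed.
    """
    results = {}
    conn = http.client.HTTPConnection(host, port, timeout=30)
    try:
        for name, markup in documents.items():
            # Assumed field name; adjust to match the CGI's form.
            body = urllib.parse.urlencode({"fragment": markup}).encode("utf-8")
            headers = {
                "Content-Type": "application/x-www-form-urlencoded",
                "Connection": "keep-alive",
            }
            conn.request("POST", path, body=body, headers=headers)
            response = conn.getresponse()
            # Read the full body so the connection can be reused
            # (http.client reconnects by itself if the server closed it).
            text = response.read().decode("utf-8", "replace")
            results[name] = (response.status, text)
    finally:
        conn.close()
    return results
```

A call such as `validate_documents({"index.html": open("index.html").read()})` would return the status and raw validator output for each file. This removes the per-file curl process, but each request still pays the CGI startup cost on the server side unless the script is made persistent.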

As for the validator CGI script, you should configure it as a mod_perl registry script. Make sure you preload all the necessary libraries.

This should boost the number of documents processed per second from 1.3 to something more palatable.

CGI is a pretty simple API. All it does is read data either from an environment variable (for GET requests) or from stdin (for POST requests). So all you need to do is set up the environment and call the script. See the docs for details.
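To make that concrete, here is a rough Python sketch that sets up the CGI environment variables, feeds a URL-encoded POST body to the script on stdin, and splits the output into headers and body. The helper name `run_cgi`, the field names you pass in, and the placeholder server values are assumptions; a given script may expect additional variables or a multipart body.

```python
import os
import re
import subprocess
import urllib.parse


def run_cgi(command, fields):
    """Run a CGI program directly, bypassing the web server.

    `command` is the argument list for the script (for example
    ["perl", "check"]); `fields` is a dict of form fields sent as a POST.
    Returns (header_text, body_bytes).
    """
    body = urllib.parse.urlencode(fields).encode("utf-8")
    env = dict(os.environ)
    env.update({
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": "POST",  # for GET, put the data in QUERY_STRING
        "CONTENT_TYPE": "application/x-www-form-urlencoded",
        "CONTENT_LENGTH": str(len(body)),
        "QUERY_STRING": "",
        # Placeholder server details; some scripts read these.
        "SERVER_NAME": "localhost",
        "SERVER_PORT": "80",
        "SERVER_PROTOCOL": "HTTP/1.1",
    })
    completed = subprocess.run(
        command, input=body, env=env, stdout=subprocess.PIPE, check=True
    )
    raw = completed.stdout
    # CGI output is a header block, a blank line, then the document.
    match = re.search(rb"\r?\n\r?\n", raw)
    if match is None:
        return raw.decode("latin-1"), b""
    return raw[:match.start()].decode("latin-1"), raw[match.end():]
```

This skips Apache and the HTTP round trip, but it still starts one interpreter per document, so it is likely to help less than keeping the script resident in a persistent process.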

If the script uses CGI.pm, you can run it from the command line by supplying the '-debug' switch (to CGI.pm, in the use statement). That will then allow you to send the POST variables on stdin. You may have to tweak the script a little to make this work.