I've got a lovely Django site up and running, but I noticed that my error.log file was getting huge: over 150 MB after a few weeks of being live. It turns out a bunch of spambots are probing for well-known URL vulnerabilities (or something like that) and hitting a bunch of paths like http://mysite.com/ie or http://mysite.com/~admin.php etc.

Since Django uses URL rewriting, it looks for templates to fit these requests, which raises a TemplateDoesNotExist exception and then a 500 message (Django does this, not me). I have debug turned off, so they only get the generic 500 message, but it's filling up my logs very quickly.

Is there a way to turn this behavior off? Or perhaps just block the IPs doing this?

Um, perhaps use logrotate to rotate and compress the logs periodically, if that isn't being done already.

"Is there a way to turn this behavior off?" - the 500 is absolutely mandatory. The log entry is also mandatory.

"Or perhaps just block the IPs doing this?" - you wish.

Everybody has this problem. Nearly everybody uses Apache log rotation. Everybody else either uses OS-level rotation or rolls their own.

If you can find a pattern in the User-Agent string, you can use the DISALLOWED_USER_AGENTS setting. Mine is:

import re

# Compiled regexes; a request whose User-Agent matches any of them is rejected.
DISALLOWED_USER_AGENTS = (
    re.compile(r'Java'),
    re.compile(r'gigamega'),
    re.compile(r'litefinder'),
)

See the description in the Django documentation.
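To see which User-Agent strings a list like this would catch before deploying it, a small stand-alone check can help. This is only a sketch: `is_disallowed` is a hypothetical helper, not part of Django, and it assumes the patterns are searched anywhere in the header value (which, as far as I know, is how Django's CommonMiddleware applies them).

```python
import re

# Same patterns as in the settings snippet above.
DISALLOWED_USER_AGENTS = (
    re.compile(r'Java'),
    re.compile(r'gigamega'),
    re.compile(r'litefinder'),
)


def is_disallowed(user_agent, patterns=DISALLOWED_USER_AGENTS):
    """Hypothetical helper: True if any pattern occurs in the User-Agent string."""
    return any(pattern.search(user_agent) for pattern in patterns)


if __name__ == "__main__":
    for agent in ("Java/1.6.0_04", "Mozilla/5.0 (X11; Linux x86_64)"):
        print(agent, "->", "blocked" if is_disallowed(agent) else "allowed")
```

Note that the patterns are case-sensitive as written, so a bot announcing itself as "java" would slip through unless the regex is compiled with re.IGNORECASE.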

Django should be throwing a 404, not a 500, when the URL doesn't match any entries in your URLconf.

http://docs.djangoproject.com/en/dev/topics/http/urls/#handler404

You need to provide a 404 template:

If you don't define your own 404 view -- and simply use the default, which is recommended -- you still have one obligation: To create a 404.html template in the root of your template directory. The default 404 view will use that template for all 404 errors.

A programmatic solution would be to:

  • open the log file
  • read the lines into a buffer
  • remove the lines that match the errors the bots triggered
  • seek to the start of the file
  • write the new buffer
  • truncate the file at the current pointer position
  • close

Voila! It's done!
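The steps above could look roughly like this in Python. It's a minimal sketch, not a hardened tool: `strip_matching_lines` and `BOT_ERROR_RE` are names made up for illustration, and the pattern simply assumes the bot-triggered entries mention TemplateDoesNotExist as described in the question.

```python
import re

# Assumption: bot-triggered log lines contain this exception name.
BOT_ERROR_RE = re.compile(r"TemplateDoesNotExist")


def strip_matching_lines(path, pattern=BOT_ERROR_RE):
    """Rewrite the file at `path` in place, dropping lines that match `pattern`.

    Returns the number of lines kept.
    """
    with open(path, "r+", encoding="utf-8") as log:  # open the log file
        lines = log.readlines()                      # read the lines into a buffer
        kept = [line for line in lines if not pattern.search(line)]
        log.seek(0)                                  # seek to the start of the file
        log.writelines(kept)                         # write the new buffer
        log.truncate()                               # cut off the leftover tail
    return len(kept)                                 # file is closed by the with-block
```

Two caveats: this filters line by line, so a multi-line traceback would only lose the lines that actually match, and rewriting a log while the server is still appending to it can lose entries, so it's safer to run it on a rotated copy.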

What about setting up a catch-all pattern as the last item in your URLs file and pointing it to a generic "no such page" view or even your home page? In other words, turn the 500s into requests for the home page.

Why not fix those "bugs"? If a URL pattern isn't matched, a proper error message should be shown. By adding those templates you'll help the user and yourself :-)

  1. Yes, it should be a 404, not a 500. A 500 indicates that something is trying to handle the URL and is failing along the way. You need to find and fix that.

  2. We have a similar problem. Since we're running Apache/mod_python, I chose to deal with it in .htaccess with mod_rewrite rules. I periodically look at the logs and add a few patterns to my "go to hell" list. These all rewrite to deliver a 1x1 pixel GIF file. There's no tsunami of 404s to clutter up my log analysis, and it puts minimal load on Django and Apache.

You cannot make these a**holes go away, so all you can do is minimize their impact on your system and get on with your life.
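For the "periodically look at the logs" part of point 2, something like the following could surface candidate patterns. It's a rough sketch under assumptions: the access log is in Apache's common/combined format, and `top_missing_paths` and `REQUEST_RE` are hypothetical names, not anything Apache or Django provides.

```python
import re
from collections import Counter

# Assumed line shape: ... "GET /some/path HTTP/1.1" 404 209 ...
REQUEST_RE = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def top_missing_paths(lines, limit=10):
    """Return the most frequently requested paths that got a 404 response."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("status") == "404":
            counts[match.group("path")] += 1
    return counts.most_common(limit)
```

Feeding it an open access log, e.g. `top_missing_paths(open("access.log"))`, gives a list of (path, count) pairs to eyeball before adding anything to the rewrite rules.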