I am struggling to write a regex for Apache logs. The log format I am using is below.

I would like to be able to match ANY word or phrase in the position where "/city/index.html" is.

66-121-89-14.domain.com - - [14/Apr/2011:14:47:05 +0100] "GET /city/index.html HTTP/1.1" 200 2577 "http://www.domain.com/referrer/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16"

Can a regex expert help?

--edit--

It's for ignoreregex in Fail2ban. I'd like to be able to put something like /house to catch all files in that directory, or /house/jonross.html specifically to match exactly that HTML file. Many thanks.

If by "phrase" you mean "line" then that might be ^.*/city/index\.html.*$ in multiline mode.

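/^.*\/city\/index\.html.*$/gm

This will match any line that contains the path /city/index.html. The dot before html is escaped so that it matches only a literal dot, and the m flag makes ^ and $ match at the start and end of each line.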

Sample at http://refiddle.com/10p
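As a rough way to check the line-matching approach, here is a minimal Python sketch. The second log line, the shortened user agent, and the names LOG, LINE_RE and matching_lines are made up for illustration; only the first request comes from the question.

```python
import re

# Two access-log lines; the second one is an invented non-matching request.
LOG = (
    '66-121-89-14.domain.com - - [14/Apr/2011:14:47:05 +0100] '
    '"GET /city/index.html HTTP/1.1" 200 2577 '
    '"http://www.domain.com/referrer/" "Mozilla/5.0"\n'
    '66-121-89-14.domain.com - - [14/Apr/2011:14:47:06 +0100] '
    '"GET /other/page.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0"\n'
)

# re.MULTILINE makes ^ and $ match at every line boundary,
# and \. matches a literal dot only.
LINE_RE = re.compile(r'^.*/city/index\.html.*$', re.MULTILINE)

# With no capture groups, findall returns each whole matching line.
matching_lines = LINE_RE.findall(LOG)
print(len(matching_lines))  # 1
```

When the regex is applied to one log line at a time, the multiline flag is not needed.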

/"\w+ (.*?) HTTP\// will capture the URL of the request.
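A small Python sketch of that capture, using the log line from the question. The names REQUEST_RE, line and request_path are illustrative only.

```python
import re

# "\w+ matches the opening quote and the HTTP method, the lazy group
# (.*?) takes everything up to the first " HTTP/".
REQUEST_RE = re.compile(r'"\w+ (.*?) HTTP/')

line = (
    '66-121-89-14.domain.com - - [14/Apr/2011:14:47:05 +0100] '
    '"GET /city/index.html HTTP/1.1" 200 2577 '
    '"http://www.domain.com/referrer/" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.2.16) '
    'Gecko/20110319 Firefox/3.6.16"'
)

match = REQUEST_RE.search(line)
request_path = match.group(1) if match else None
print(request_path)  # /city/index.html
```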

After several hours of trial and error, and in case it helps someone else: this will ignore any request whose path, directly after the slash following GET, starts with one of these words:

ignoreregex = .*\"GET \/(city|house|anything).*

To match a specific .html file, the dot needs to be escaped (\.html).
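Since Fail2ban filters are written as Python regular expressions, a pattern like this can be tried out in plain Python before it goes into the filter file. The sketch below is only an approximation of how Fail2ban applies ignoreregex; the helper request_line, the test paths, and the single-file pattern IGNORE_FILE are assumptions added for illustration.

```python
import re

# Directory-level pattern, equivalent to the ignoreregex in this answer.
IGNORE_DIRS = re.compile(r'.*"GET /(city|house|anything).*')

# Hypothetical single-file variant: the dot is escaped and the trailing
# space anchors the end of the path.
IGNORE_FILE = re.compile(r'.*"GET /house/jonross\.html .*')


def request_line(path):
    """Build a minimal access-log line for a made-up request path."""
    return (
        '66-121-89-14.domain.com - - [14/Apr/2011:14:47:05 +0100] '
        f'"GET {path} HTTP/1.1" 200 2577'
    )


paths = [
    '/city/index.html',
    '/house/jonross.html',
    '/house/jonrossXhtml',
    '/admin/login.php',
]

dir_hits = [p for p in paths if IGNORE_DIRS.search(request_line(p))]
file_hits = [p for p in paths if IGNORE_FILE.search(request_line(p))]

print(dir_hits)   # ['/city/index.html', '/house/jonross.html', '/house/jonrossXhtml']
print(file_hits)  # ['/house/jonross.html']
```

Note that the directory pattern matches on a prefix, so a path such as /cityscape would also be ignored; appending / inside the pattern, as in /(city|house|anything)/, would restrict it to those directories.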