Sunday, March 28, 2010

Dealing with ignorable errors

In real-world applications assembled out of heterogeneous parts or interacting with systems beyond your control, error conditions are bound to occur. Oftentimes, some of these errors can be safely ignored. For example, the qos.ch e-commerce site is crawled by Googlebot, which for some unknown reason insists on visiting invalid URLs about 30-50 times a day. These invalid requests cause Wicket to throw exceptions of type WicketRuntimeException.

Given that Wicket relies on SLF4J for logging, the WicketRuntimeException is logged as an error. To keep abreast of errors in our e-commerce applications, we use SMTPAppender in our logging configuration. Thus, every event logged as an error triggers an email to our site's administrator. As you can imagine, receiving over 30 emails per day as a result of erroneous Googlebot requests defeats the purpose of sending log reports by email. (After the 100th bogus email, the admin stops caring.)

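Below is a configuration file which instructs SMTPAppender to ignore errors generated by Googlebot. The evaluator triggers an email for events of level WARN or higher, unless the req.userAgent value in the MDC contains "Googlebot".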
<configuration scan="true">
  <statusListener class="ch.qos.logback.core.status.OnConsoleStatusListener" />
  <appender name="EMAIL" class="ch.qos.logback.classic.net.SMTPAppender">
    <layout class="ch.qos.logback.classic.html.HTMLLayout">
      <pattern>
        %date%-5level%thread%X{req.remoteHost}%logger%msg
      </pattern>
    </layout>
    <From>...</From>
    <SMTPHost>...</SMTPHost>
    <Subject>[${HOSTNAME}] %msg</Subject>
    <To>...</To>
    <evaluator class="ch.qos.logback.classic.boolex.JaninoEventEvaluator">
      <expression>
      level>=WARN
        &amp;&amp;
      !( mdc.get("req.userAgent") != null &amp;&amp; ((String) mdc.get("req.userAgent")).contains("Googlebot"))
      </expression>
    </evaluator>
  </appender>

  <root level="INFO">
    <appender-ref ref="EMAIL" />
  </root>
</configuration>

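This configuration relies on JaninoEventEvaluator to evaluate the expression, and it requires the installation of MDCInsertingServletFilter, which places request data such as req.userAgent and req.remoteHost into the MDC.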
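
For readers who have not installed the filter yet, here is a minimal sketch of what the declaration might look like in the web application's web.xml. The filter name and the /* URL pattern are assumptions chosen for illustration; adjust them to your own deployment.

```xml
<!-- Sketch only: the filter-name and url-pattern are illustrative choices. -->
<filter>
  <filter-name>MDCInsertingServletFilter</filter-name>
  <filter-class>ch.qos.logback.classic.helpers.MDCInsertingServletFilter</filter-class>
</filter>
<filter-mapping>
  <filter-name>MDCInsertingServletFilter</filter-name>
  <!-- Apply to every request so that req.userAgent is available to the evaluator -->
  <url-pattern>/*</url-pattern>
</filter-mapping>
```

If other filters in the chain log events that should carry the request data, this filter probably needs to be declared before them.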

You could also throw markers into the mix, as described in a previous post. Note that the Googlebot problem should really be addressed by setting an internal error page, but that's a different story.
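
As a rough illustration of the marker idea, an evaluator could trigger on a marker instead of, or in addition to, the level. The sketch below assumes a hypothetical marker named NOTIFY_ADMIN attached by the application code, and a logback version in which JaninoEventEvaluator exposes the marker variable; it is not taken from the configuration above.

```xml
<!-- Sketch only: NOTIFY_ADMIN is a hypothetical marker name. -->
<evaluator class="ch.qos.logback.classic.boolex.JaninoEventEvaluator">
  <expression>
    marker != null &amp;&amp; marker.contains("NOTIFY_ADMIN")
  </expression>
</evaluator>
```

Such a condition could also be combined with the Googlebot check using the same &amp;&amp; and || operators.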
