This test has two parts. First, the generated HTML is sanitized to remove any
potentially sensitive information (e.g. filenames, authors, document info)
and is then sent to the W3C Validator service at https://validator.w3.org/nu.
The results are inspected, and if any errors or warnings are returned, the
test fails. If the site cannot be reached, this is NOT treated as a test failure.
Second, the actual (unsanitized) filenames are checked for validity: the HTML
standard prohibits backslashes in URLs, even if the URL refers to a local file
on a system that uses backslashes as a path separator (e.g. Windows). The W3C
Validator would have caught such a backslash had the filenames not been sanitized.
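
The two checks described above could be sketched roughly as follows. This is an
illustration only, not the test's actual implementation: the function names
(failing_messages, urls_with_backslashes) and the choice of href/src as the
URL-bearing attributes are assumptions made for the example. The first function
assumes the validator's JSON output format, in which each entry of "messages"
has a "type" of "error" or "info", and warnings are reported as "info" with a
"subType" of "warning".

```python
from html.parser import HTMLParser

# Attributes assumed to carry URLs; an illustrative subset, not exhaustive.
URL_ATTRS = {"href", "src"}


def failing_messages(report):
    """Return the validator messages that should fail the test.

    `report` is the decoded JSON response from the validator. Errors and
    warnings are kept; other message types are ignored in this sketch.
    """
    failing = []
    for msg in report.get("messages", []):
        kind = msg.get("type")
        is_warning = kind == "info" and msg.get("subType") == "warning"
        if kind == "error" or is_warning:
            failing.append(msg)
    return failing


class _UrlCollector(HTMLParser):
    """Collect the values of URL-bearing attributes from start tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in URL_ATTRS and value is not None:
                self.urls.append(value)


def urls_with_backslashes(html):
    """Return every collected URL in the unsanitized HTML that has a backslash."""
    collector = _UrlCollector()
    collector.feed(html)
    collector.close()
    return [url for url in collector.urls if "\\" in url]
```

Under these assumptions, a test would fail if either failing_messages(report) or
urls_with_backslashes(html) returned a non-empty list, while a network error when
contacting the validator would be caught and ignored before failing_messages is
ever called.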