Using static site checker

Running ssc

To run ssc, bring up the command line, change to ssc’s installation directory, and type:

Windows ssc
Unix ./ssc

If it’s installed correctly, it should respond with its copyright information.

It is possible to set up configuration files, but, to start, we’ll try a simple set of command line options. If the HTML files belonging to your website are located in the current directory, and the main index file is called index.html, then all you need to type is:

ssc .

… and off it will go, reporting any issues it finds in all HTML files in the current directory and its descendants.
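If you would rather start ssc from a script, the same invocation can be wrapped in a few lines of Python. This is only a sketch: the helper name run_ssc is made up for illustration, and it assumes nothing about ssc’s exit codes or about which output stream carries the report, so it simply captures both.

```python
import shutil
import subprocess


def run_ssc(site_dir=".", exe="ssc"):
    """Run ssc on site_dir (hypothetical helper, not part of ssc).

    Returns the CompletedProcess, or None if the executable cannot be found.
    """
    path = shutil.which(exe)  # accepts a bare name on PATH or a path such as ./ssc
    if path is None:
        return None
    # Equivalent to typing: ssc <site_dir>
    return subprocess.run([path, site_dir], capture_output=True, text=True)


if __name__ == "__main__":
    result = run_ssc(".")
    if result is None:
        print("ssc not found; pass its location as exe, e.g. exe='./ssc'")
    else:
        print(result.stdout)
        print(result.stderr)
```

If ssc is not on your PATH, pass the path to the executable in its installation directory as exe.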

Reports

When ssc reports issues, it reports the HTML where it spots the problem, with the approximate line number. This may be after the actual location of the problem. Consider, for example:

<P>Taost<A href="https://example.com/superdoopersecretrecipe.html" title="what a wonderful page oh so wow" onclick="halt-and-catch-fire" onhover="wink-embarrassingly" whimper=frequently>d wombats are a delicious</A> thing.</P>

The word ssc spots is “Taostd”. It knows there’s a word to check when it gets to the end of the word, e.g. at the space that follows it. In this case, the word is interrupted by rather a lot of HTML, which is ignored when checking spelling. Because the intervening HTML is so long, and may cross a number of lines, and because ssc reports small snippets of HTML per issue (to avoid spouting too much verbiage), it will not show sufficient HTML to see the entire misspelt word. That’s why, usually, you should take the snippet and corresponding line number as a hint, not an absolute. There is an issue, and it is near the reported HTML and line number, but, in this case, it is not exactly there. Of course, the report of the misspelt word mentions the word it considers misspelt.
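To see why the word only becomes visible once the markup is ignored, here is a small Python sketch using the standard library’s html.parser. It is not how ssc itself works internally (the source does not describe that); it merely illustrates that dropping the tags joins “Taost” and “d” into one word. The class and function names are invented for this example, and the snippet is a shortened version of the one above.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects character data and ignores all tags (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_words(html):
    """Return the words a reader would see once the markup is stripped."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Text on either side of a tag is joined with nothing in between,
    # so a word split by markup becomes a single word again.
    return "".join(parser.chunks).split()


snippet = (
    '<P>Taost<A href="https://example.com/superdoopersecretrecipe.html" '
    'title="what a wonderful page oh so wow" whimper=frequently>'
    "d wombats are a delicious</A> thing.</P>"
)

print(visible_words(snippet)[0])
```

The first word printed is the one a spelling check would have to consider, even though no single stretch of the HTML source contains it.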

Arguments

ssc has a number of switches, including:

-h      Output simple help
-V      Output ssc’s version
-f f    Load configuration from file f
-F      Load configuration from ~/.ssc/config
-o f    Output a report to file f
-v x    Report level x issues, where x is:
            0   None
            1   Catastrophes
            2   Abhorrences
            3   Errors
            4   Warnings
            5   Info
            6   Comments
            7   Debug
            8…  More and more debug
-I      Process Server Side Includes
-g d    Set the root directory to d (defaults to current directory)
-x x    Treat files with extension x as HTML/XHTML
-i f    File f is default (default index.html)
-s x    x is the local site’s domain name
-L vd   Define a virtual directory (vd is formatted ‘virtual=physical’)
-e      Check external links (each link is checked once per run, not once per encounter)
-l      Check internal links
-O      Report each broken link once, not each time it’s found
-r      Do not check for HTTPS certificate revocation
-X      Check crosslinked IDs
-Y 1    Disable multithreading
-M      Check microformats use of class attribute
-m      Check ontologies use of WhatWG microdata attributes
-S      Report site statistics
-A      Report switches noticed by ssc, and exit.

If you find yourself having switch problems, add -A to your command line to see what switches ssc has noticed. -A kyboshes other processing.
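When a command line grows, it can help to assemble it programmatically. The following Python sketch builds an argument list from a handful of the switches listed above. The function name and its parameters are hypothetical, and placing the switches before the directory is an assumption (the source does not say whether order matters); the example virtual directory paths in the usage note are invented too.

```python
import shlex


def build_ssc_args(site_dir=".", report_level=None, internal_links=False,
                   external_links=False, report_file=None,
                   virtual_dirs=None, single_thread=False, show_switches=False):
    """Assemble an ssc command line as a list (hypothetical helper)."""
    args = ["ssc"]
    if report_level is not None:
        args += ["-v", str(report_level)]      # -v x: report level x issues
    if internal_links:
        args.append("-l")                      # -l: check internal links
    if external_links:
        args.append("-e")                      # -e: check external links
    if report_file:
        args += ["-o", report_file]            # -o f: output a report to file f
    for virtual, physical in (virtual_dirs or {}).items():
        args += ["-L", f"{virtual}={physical}"]  # -L virtual=physical
    if single_thread:
        args += ["-Y", "1"]                    # -Y 1: disable multithreading
    if show_switches:
        args.append("-A")                      # -A: report switches, then exit
    args.append(site_dir)                      # assumed: directory goes last
    return args


print(shlex.join(build_ssc_args(report_level=4, internal_links=True,
                                show_switches=True)))
```

Adding show_switches=True while experimenting mirrors the advice above: ssc then reports the switches it noticed instead of checking the site. A virtual directory would be passed as, say, virtual_dirs={"/img": "/srv/images"}.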

A full list of switches is given by the -h switch, and can also be found in the command line man page (see More info below).

Command line arguments can also be assembled in configuration files (see the -f and -F switches above).

Environment

ssc will process the following environment variables:

SSC_CONFIG name of configuration file if none given on the command line
SSC_ARGS command line switches
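As an illustration of the two variables above, this Python sketch prepares an environment for a child process. The helper name is made up, the configuration file name site.ssc is hypothetical, and it assumes SSC_ARGS holds the switches as a single space-separated string, just as you would type them; the source only says it contains “command line switches”.

```python
import os
import shlex


def ssc_environment(switches=None, config_file=None, base=None):
    """Return a copy of the environment with ssc's variables set (hypothetical helper)."""
    env = dict(os.environ if base is None else base)
    if switches:
        # Assumption: SSC_ARGS is parsed like an ordinary command line.
        env["SSC_ARGS"] = shlex.join(switches)
    if config_file:
        # Used by ssc only if no configuration file is given on the command line.
        env["SSC_CONFIG"] = config_file
    return env


env = ssc_environment(["-v", "4", "-l"], config_file="site.ssc", base={})
print(env)
```

The resulting dictionary can be handed to subprocess.run via its env parameter when launching ssc.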

ssc will recognise when it is running in a CGI environment by processing QUERY_STRING and other CGI environment variables. In such circumstances, QUERY_STRING must contain a parameter called html.snippet, with an appropriate value.
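For the CGI case, a query string carrying an html.snippet parameter could be prepared as in the Python sketch below. The source does not spell out what “an appropriate value” is, so treat it as an assumption that the value is the HTML to be checked, URL-encoded in the usual way; the function name is invented for this example.

```python
from urllib.parse import parse_qs, urlencode


def snippet_query_string(html):
    """Build a QUERY_STRING with an html.snippet parameter (assumed encoding)."""
    return urlencode({"html.snippet": html})


qs = snippet_query_string("<p>Taostd wombats</p>")
print(qs)

# Decoding the query string recovers the original snippet.
print(parse_qs(qs)["html.snippet"][0])
```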

Potential gotchas

Versions

For HTML, ssc defaults to a version of the living standard that’s contemporary with the version of ssc. The same applies to other document types. In all cases, though, if the document itself specifies a particular version (such as HTML 3.2), then that applies.

The version indicated by the DOCTYPE can be very broad. For example, the HTML 5 doctype specifies HTML 5 alone, but not which flavour (W3C or living standard), nor which revision. There have been rather a lot of changes between the 2005 original and the current version.
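The difference is easy to see by looking at what a DOCTYPE actually contains. The Python sketch below pulls out the public identifier, if any. It is an illustration only, not ssc’s version detection, and the function name is made up.

```python
import re

# Matches <!DOCTYPE html ...> and captures a quoted public identifier if present.
_DOCTYPE = re.compile(r'<!DOCTYPE\s+html(?:\s+PUBLIC\s+"([^"]*)")?', re.IGNORECASE)


def doctype_public_id(html):
    """Return the DOCTYPE's public identifier, or None if there is none."""
    match = _DOCTYPE.search(html)
    return match.group(1) if match else None


old = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><html></html>'
new = "<!DOCTYPE html><html></html>"

print(doctype_public_id(old))  # names a specific version
print(doctype_public_id(new))  # nothing to go on beyond "HTML 5"
```

The older doctype names its version explicitly, whereas the HTML 5 doctype carries no identifier at all, so the flavour and revision have to come from somewhere else.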

If you are verifying HTML, and it contains some CSS, MathML, SVG, or other content, then ssc will attempt to guess the version of the latter from the former, unless you say otherwise.

This is why, if you want to be precise about document verification, you should specify versions, either with a switch or an entry in a configuration file. If your HTML contains MathML/SVG/etc. snippets, then specify those versions too.

Stressed systems

Unfortunately, some system suppliers sometimes release buggy compilers. These bugs can often be avoided by using an alternative compiler (for example, if using clang, try gcc, and vice versa), disabling multithreading at runtime (use -Y 1), or not upgrading your system until the supplier releases a less unreliable version of their noisy compiler.

More info

The ssc source root includes some text files with more information.

gen.text Command line man page
build.txt How to make your own copy of ssc
releasenotes.txt A history of features and featurettes
README.txt An introduction to ssc
LICENCE.txt Component licences
LICENSE.txt The GNU General Public License Version 3

Dylan Harris
December 2024