static site checker

content

introduction
try
why
download
README
usage
known issues
build
source
boot notes
copyright & licence


introduction

The static site checker is an opinionated HTML nitpicker, a command–line tool to validate static HTML & XHTML websites. I built it to nitpick my hand–coded identity website. I’m making it available should others find it useful.

It should not be used on untrusted content; its parsers are holier than Robin’s cow.

Dylan Harris
January 2022


try







        

notes

SSC is a site checker, not a page checker, yet here you can only input a page, at most. It is not the best possible illustration of the program’s abilities, but it is better than nothing. Testing a web site would require proof of control of that site, and I’m not going there for a simple demo.

SSC is pre–alpha software. It has more errors than Fido has fleas. This page is a demo of its potential, that’s all. Do not presume reported issues are correct. Do not presume unreported issues aren’t issues. If you’re unsure, check against the appropriate standard. There is no guarantee of anything, let alone accuracy.

There is a fairly tight limit on the size of a snippet.


why ssc

Why did I make the static site checker? Aren’t there a lot of other HTML validators around? Well, first of all, I’ve not found a website validator, only web page validators. Perhaps I didn’t search sufficiently.

I have a fairly big website, with more than 100,000 pages. I’m too impatient to run each through a validator individually; I want to validate my site as a whole. Some errors occur between pages, not specifically on pages: not just missing links, but, for example, it’s invalid to link to an otherwise valid id on another page when that id’s element, or one of its ancestors, is HIDDEN.
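
The hidden-id rule can be illustrated with a short Python sketch. It is not taken from SSC's source; the names `HiddenIdCollector` and `check_link` are hypothetical, and it assumes that a bare `hidden` attribute on an element hides everything inside it. It records each id on a page along with whether that id sits on, or inside, a HIDDEN element.

```python
from html.parser import HTMLParser

# Elements with no closing tag; they must never be pushed on the open-element stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class HiddenIdCollector(HTMLParser):
    """Record every id on a page and whether it is on or inside a HIDDEN element."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one (tag, hidden) pair per currently open element
        self.ids = {}     # id -> True if hidden, False if visible

    def handle_starttag(self, tag, attrs):
        inherited = self.stack[-1][1] if self.stack else False
        hidden = inherited or any(name == "hidden" for name, _ in attrs)
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] = hidden
        if tag not in VOID:
            self.stack.append((tag, hidden))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating sloppy nesting.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def check_link(target_id, page_source):
    """Classify a link to page#target_id as 'ok', 'hidden' or 'missing'."""
    parser = HiddenIdCollector()
    parser.feed(page_source)
    parser.close()
    if target_id not in parser.ids:
        return "missing"
    return "hidden" if parser.ids[target_id] else "ok"
```

A site-wide check would run something like this once per target page and then test every cross-page link against the result, which is exactly the work a page-at-a-time validator cannot do.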

The validators I did find were incomplete. Now, admittedly, I checked these out a few years ago, and some may well have got better. Editor based validators are certainly very useful, but they only work on individual pages, not sites. You have to edit a page to get it validated, and if you have rather a lot of them, that’s a lot of time wasted opening and closing individual pages. To expand that ID example above, if, in a specialist editor, you add a HIDDEN attribute to an element which has an id on a child element, does that editor then name & shame the other page you’ve just invalidated?

All this is part of the reason why many people use frameworks. (Another, the obvious one, is to get a site up quickly.) My main difficulty with frameworks is that so many of them are, visually speaking, boring and trite. The visual arts world has had centuries to work out excellent form and vision to fit in a rectangular space, and it seems to me the modern web hasn’t noticed. The best that can be said is that some of them have approached the advances made in the 14th century, and that’s just in the Western artistic tradition. So much more is possible, yet it hasn’t happened. I want to break free from this dull, stultifying conservatism.

It may be that I’m making the wrong comparison, that the web isn’t about image, it’s about type. The comparison should not be with pictures, but papers. There’s certainly something to that. The Western visual high arts never did really suss mixing writing and form (actually, that’s not really true, but, IMHO, such arts never broke out of their context). But arts from Japan, for example, certainly did, and the web doesn’t seem to have noticed them either.

Also, to be absolutely fair, there are experimental websites mixing imagery and text rather well. But then we get back to my point about the 14th century. Those I’ve seen (and I’ve certainly not seen as many as there are, nor come close to it) still seem not to have noticed the changes in visual form made since the Middle Ages.

Anyway, enough of this. Rather than criticising other people for not doing, I should do. I should make my point, not by criticising others for not thinking of it, but by example. I need to knock up some example sites. That’s where SSC comes in.

You see, if I am to build a site using what is effectively an experimental visual process, I can’t use existing web site design frameworks. But if I can’t use a framework, I have to hand code everything. And there’s a key problem: HTML is such a convoluted, evolved mess, that the people who design it, in their own design presentations, make errors. Ok, I only found this out by testing SSC on them, which perhaps illustrates my point about things being overcomplicated. Anyway, I’m not going to reveal any names because these people are actually working hard to make the web a better place. Let’s just say W3 has broken links, WhatWG references withdrawn standards, and many other authors’ sites have other internal inconsistencies. I must mention that my HTML code is far worse than any of these mild examples of technical naughtiness. But the fact that the people who define the web make mistakes in the usage of their design in the documents that espouse the design, does rather explain why most other people are forced to use dull, formulaic, archaic, boring, tools.

I’ve not yet built a site inspired by the visual art world’s lessons in form and layout. My efforts have been spent in building the tool to make that possible. But now, I contend, it is at least a little more possible than it was.

Since I’m here, I’ll list other issues I have with frameworks:

  1. They have to be regularly maintained. Every time an update comes out, that update has to be applied to a site, or, alternatively, the update is ignored and the site becomes vulnerable to exploits blocked by the update, and published when the update is released. This is time lost.
  2. Updates for frameworks don’t always work. Instead of fixing issues, they break the site. This is why I dropped my experimental Drupal site a few years ago. This is why I stopped using NextCloud (although NitroKey’s NextBox resolves that particular problem).
  3. Frameworks are usually written in scripted languages, such as PHP. Scripts are unavoidably insecure compared to no scripts: a script cannot be hacked if it does not exist. Thus a site with no scripts is inherently more secure than a site with scripts. For example, there are, as I write, five known vulnerabilities in PHP, a very popular server scripting language (better than it once was, admittedly). If you use PHP, your site, in principle, has those vulnerabilities, and you have to spend time mitigating them. If you do not use PHP, your site cannot have those vulnerabilities. This is why my sites have no scripts, and another reason why SSC does not analyse scripts (the main one being lack of time). I do accept that sophisticated sites have no choice but to use scripts, but I suggest many sites use them unnecessarily.
  4. The worst of them all, for me, is that some frameworks, and many scripts, pull in code residing on other sites, in third–party repositories and the like, as the script is run. If you do that, the integrity and security of your site is entirely dependent on the security of the repository. There are unfortunately many examples of repositories being hacked, and, in consequence, of all the sites that used those repositories being broken in turn.

Dylan Harris
October 2021


download

linux (amd64) : centos 8 / ubuntu 20.04
macos (intel/amd64) : catalina / big sur / monterey
openbsd (amd64) : 6.8 / 6.9 / 7.0
windows 10 : x86 / x64
source : gzip

notes

These downloads are NOT signed (I’ve not got my act together). If that concerns you, grab the source and compile SSC yourself. Source is also available here, below, & on Github.

It may be necessary to install boost, icu4c, libiconv & hunspell (including dictionaries) for the binaries to work. It is normally sufficient to install them using your operating system’s standard package manager.

The specifications used to make SSC were acquired from various public websites. To avoid confusion when discussing details, here is a humungous collection of the documents.


README

Static Site Checker
(an opinionated HTML nitpicker)
version 0.0.122
https://ssc.lu/



(c) 2020-2022 dylan harris
see LICENCE.txt for copyright & licence notice
see W3-LICENCE.txt for additional copyright & licence information



WARNING: this code is:
- incomplete
- pre-alpha
- IT PROBABLY WON'T BEHAVE AS YOU EXPECT :-)
- do NOT feed it untrusted data



ssc analyses static HTML snippets, files and sites:
- HTML 1.0/+/2.0/3.0/3.2/4.00/4.01/5.0/5.1/5.2/5.3-draft
- HTML living standard, Jan 2005 to Jan 2022
- SVG 1.0/1.1/1.2 Tiny/1.2 Full/2.0/2.x draft Apr 2021
- MathML 1/2/3/4-draft
- XHTML 1.0/1.1/2.0/5.x
- finds broken links (requires curl)
- processes server side includes, mostly
- analyses microdata & RDFa
- analyses RDFa standard context ontologies & others

with opinions on:
- standard english where dialect is required
- perfectly legal but sloppy HTML
- abhorrent rudeness such as autoplay on videos
- dubious spelling

It does NOT:
- behave securely: its parser is holier than robin's cow
- analyse or understand scripts
- analyse or understand styles, beyond nicking class names from CSS
- analyse or understand XML or derivatives except as noted above

It can output:
- 'repaired' HTML (not XHTML)
- HTML with resolved Server Side Includes
- JSON summaries of microformat and microdata content
- website statistical information
- updated website with datafile deduplication


ssc -h
for a usage summary.

ssc -f config_file
analyse site using preprepared configuration

ssc directory
analyse website based in directory



To build & run:
1. Follow the build instructions in build.txt
2. Gleefully run ssc. It will misbehave if you are insufficiently gleeful.



NOTE
SSC can be run in a CGI environment. This is intended for use with OpenBSD's native httpd web server
(https://man.openbsd.org/httpd.8). You are reminded that SSC is pre-alpha software. Do NOT expose it
to untrusted data sources, such as the open web, without taking serious precautions. SSC probably has
more bugs than the Creator's Ultimate All-Beetle Extravaganza (J.B.S. Haldane, apocryphal : "[the
Creator has] an inordinate fondness for beetles.").
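
The usage notes say that, when QUERY_STRING carries the parameter html.snippet, SSC nitpicks that snippet only, and that general.verbose and html.version are among the other parameters processed. How SSC decodes the query is not documented here, so the following Python is only a sketch of what such a request contains, using a hypothetical helper called snippet_request.

```python
from urllib.parse import parse_qs

# Parameters the usage notes name explicitly; others may exist but are not listed there.
KNOWN_OPTIONS = ("general.verbose", "html.version")


def snippet_request(query_string):
    """Return (snippet, options) from a CGI query string, or (None, {}) if no snippet."""
    fields = parse_qs(query_string, keep_blank_values=True)
    if "html.snippet" not in fields:
        return None, {}
    snippet = fields["html.snippet"][0]
    options = {key: fields[key][0] for key in KNOWN_OPTIONS if key in fields}
    return snippet, options
```

Anything a client can put in a query string is untrusted data, which is why the warning above matters for CGI use in particular.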



Notes on names:
- recipe: a nod to Vernor Vinge's "A Fire Upon the Deep"
- tea: without tea, nothing works; then there's builders' tea
- sauce: identifies those who presume; and anyway, it's obvious
- toast: toasts code; i like burnt toast
- heater: i'm not stopping now
- unii: my preferred plural of unix; both unixes and unices sound like they sing castrato




SEE ALSO
build.txt        notes on building ssc
gen.txt          a model man page
usage.txt        how to use ssc
releasenotes.txt a slight history of releases
LICENCE.txt      ssc licence information
LICENSE.txt      formal GPL 3 licence
more licences    licences for borrowed external content



written by dylan harris
mail@ssc.lu
January 2022

usage

NAME
ssc - analyse static web site source



SYNOPSIS
ssc [...] directory
ssc -f config
ssc



DESCRIPTION
ssc (the Static Site Checker) is an opinionated HTML nit-picker, intended for
people, such as its author, who hand code websites. It doesn't just check
static websites for broken links, dubious syntax, and bad semantic data, it
will actively complain about things that are perfectly legal but just a little
bit untidy, such as its author.

Except when serving CGI queries, it recursively scans the directory looking
for HTML source files to analyse. It produces a list of errors, warnings,
comments, and other hints of imperfection. Once complete, it summarises
internal site inconsistencies, and can produce some simple statistics.
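
As a rough illustration of the recursive scan (a sketch only, not SSC's code), the following Python walks a directory tree and yields files by extension. The function name find_sources is hypothetical; the default of "html" follows the --site.extension note that .html files are always checked.

```python
from pathlib import Path


def find_sources(root, extensions=("html",)):
    """Yield files under root whose extension is in extensions, in sorted order."""
    wanted = {"." + ext.lower().lstrip(".") for ext in extensions}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in wanted:
            yield path
```

For a site that also uses server side includes, this would be called as find_sources("site", ("html", "shtml")).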

ssc ignores scripts.



COMMAND LINE ONLY SWITCHES

These options are only available on the command line:

-f file                 Load configuration from file, which should be in .INI
                        file format. See CONFIGURATION FILE FORMAT below.

-F                      Load the configuration file .ssc/config in the current
                        directory.

-h                      Show a summary of switches and exit.

-V                      Show version details and exit.

--validation            Show attribute extensions and exit. Attribute
                        extensions are additional values that can be
                        associated with attributes on many X/HTML elements.
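
Since the file loaded by -f uses section and option names derived from the long form SECTION.OPTION switch names (see CONFIGURATION FILE FORMAT), a command line can be turned into a configuration file mechanically. The Python below is a minimal sketch of that mapping; switches_to_ini is a hypothetical helper, not part of ssc, and it assumes repeated switches simply become repeated OPTION=value lines, as in the site.conf example.

```python
def switches_to_ini(switches):
    """Turn [("site.extension", "html"), ...] into INI text grouped by section."""
    sections = {}
    for name, value in switches:
        section, _, option = name.lstrip("-").partition(".")
        if not section or not option:
            # Short switches such as -e have no SECTION.OPTION form.
            raise ValueError(f"not a SECTION.OPTION switch: {name!r}")
        sections.setdefault(section, []).append(f"{option}={value}")
    blocks = ["\n".join([f"[{section}]"] + lines)
              for section, lines in sections.items()]
    return "\n\n".join(blocks) + "\n"
```

Sections appear in the order their first switch was given, and each repeatable switch keeps its own line.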



COMMAND LINE AND CONFIGURATION FILES SWITCHES

These options are available on the command line and in configuration files:

--corpus.article        Prefer the content of <ARTICLE> when gathering corpus
                        text.

--corpus.body           Prefer the content of <BODY> when gathering corpus
                        text. This is the default.

--corpus.main           Prefer the content of <MAIN> when gathering corpus
                        text.

--corpus.output file    Dump XML corpus of site into file. This is intended
                        for use by a local search engine. If none of
                        --corpus.article, --corpus.body, or --corpus.main
                        are specified, the content of <BODY> is used. If
                        more than one are specified, then the text collected
                        depends on a page's content. This is incompatible
                        with --shadow.update.

--general.css           Do NOT process .css files.

--general.custom EL     Define a custom element for verifying the IS
                        attribute. May be repeated.

--general.datapath dir  Look for any configuration, caches, and other useful
-p dir                  files, in this directory.

--general.error x       If nits of the specified category or worse are
-E                      generated, then exit with an error code. Values are:
                        'catastrophe', 'error' (the default), 'warning',
                        'info', or 'comment'.

--general.ignored EL    Ignore attributes and content of the element EL.
                        May be repeated.

--general.lang LA       If an X/HTML file does not have a language / dialect
                        specified (e.g. "en" for generic English, "en_IE"
                        for Irish English, "lb_LU" for Luxembourgish, etc.),
                        default to 'LA'. If not given, the default is your
                        system default, or, if none, then "en_US" (standard
                        American English).

--general.maxfilesize n Do not process HTML source files that exceed n bytes
                        in size (default: 4M). Specify 0 for unlimited,
                        although be warned that ssc is stunningly stupid in
                        such circumstances and may even attempt to load
                        files that exceed available memory.

--general.output        Output to the specified file. If this switch is not
-o file                 used, standard output is used.

--general.nochange      Report what ssc would do, but don't do it.
-n

--general.progress      Dump progress information to standard output. This
-D                      can interfere with formatted output.

--general.rdf           Check RDF attributes. This option currently
                        underperforms. An extension to properly support RDF
                        and RDFa is en route.

--general.rel           Only mention REL values, found neither in the living
                        standard nor at microformats.org, in debug output.

--general.slob          Ignore perfectly legal but inefficient, indeed
                        thoroughly slobby, HTML, such as being far too lazy
                        to get round to bothering to close elements.

--general.ssi           Process Server Side Includes. Although ssc can
-I                      process many server side includes, it cannot process
                        those containing formulae. Note that processing SSIs
                        may cause incorrect line numbers to be mentioned
                        when an issue is described.

--general.verbose x     Output nits to the specified verbosity:
-v                      'catastrophe', 'error', 'warning', 'info', 'comment'
                        (the default), or '0' for silence. Additional values
                        are available when debugging. Each level includes
                        its preceding level, so, for example, 'warning' will
                        also output 'catastrophe' and 'error' nits.

--html.rfc1867          Ignore the RFC 1867 (INPUT=FILE) extension when
                        processing HTML 2.0.

--html.rfc1942          Ignore the RFC 1942 (tables) extension when
                        processing HTML 2.0.

--html.rfc19802         Ignore the RFC 1980 (client side image maps)
                        extension when processing HTML 2.0.

--html.rfc2070          Ignore the RFC 2070 (internationalisation) extension
                        when processing HTML 2.0.

--html.tags             When an HTML file is loaded that contains no
                        DOCTYPE, ssc normally presumes it's an HTML 1 file.
                        This switch tells it to presume the file follows an
                        earlier HTML Tags specification (the one at CERN).
                        This is overridden by --html.version.

--html.title n          If <TITLE> text is longer than n characters, say so.
-z n                    This applies to child text of a header <TITLE>
                        element, not the value of TITLE attributes.

--html.version X        If no doctype (or xml header) is specified, presume
                        version X of HTML. X can be:
                          tags       HTML tags (1991, informal),
                          1.0        HTML 1.0 (June 1993 draft),
                          +          HTML Plus (November 1993 draft),
                          2.0        HTML 2.0 (RFC 1866),
                          3.0        HTML 3.0 (March 1995 draft),
                          3.2        HTML 3.2,
                          4.0        HTML 4.0,
                          4.1        HTML 4.01,
                          4.2        XHTML 1.0,
                          4.3        XHTML 1.1 core,
                          4.4        XHTML 2.0 (December 2010 draft),
                          5.0        W3 HTML 5.0,
                          5.1        W3 HTML 5.1,
                          5.2        W3 HTML 5.2,
                          5.3        W3 HTML 5.3 (October 2018 draft),
                          2005/1/1   WhatWG WebApps draft (January 2005),
                          ...
                          2007/1/1   WhatWG WebApps draft (January 2007),
                          2007/7/1   WhatWG HTML 5 (July 2007),
                          ...
                          2022/1/1   WhatWG HTML 5 (October 2021),
                          XHTML 1.0  XHTML 1.0,
                          XHTML 1.1  XHTML 1.1 core,
                          XHTML 2.0  (December 2010 draft),
                          XHTML 5.x  XHTML corresponding to equivalent W3
                                     HTML.

                        Although you can specify exact dates for versions of
                        the WhatWG HTML 5 living standard, currently only
                        broad versions published in January and July are
                        supported (quarterly from April 2021). It is
                        expected that, as the standard develops, more
                        precision will be applied to changes in ssc
                        analysis.

                        Certain versions of HTML offer variants, such as
                        loose and strict definitions. ssc picks those up
                        from the <!DOCTYPE ...> in the HTML file, if any,
                        and carelessly ignores them. Validation of XHTML is
                        not strict.

                        Just to remind you, there are no guarantees of
                        accuracy (or inaccuracy).

                        Copies of the appropriate standards can be found
                        online at source.

--link.301              Normally, when ssc checks external links
-3                      (--link.external), it does not report http
                        forwarding errors 301 and 308. Use this switch to
                        have it do so.

--link.external         Check external links, i.e. those not on the site
-e                      being checked. This requires a copy of curl on the
                        path. Note that, no matter what the switch, ssc will
                        NOT check certain special site names, such as
                        example.com.

--link.internal         Check internal links, i.e. those within the website
-l                      being checked.

--link.once             Only report each broken external link once. If, for
-O                      example, the site has a number of references to a
                        page that does not exist, ssc will only report the
                        first instance of the broken link. Note that, even
                        if it reports every occurrence of the link, it will
                        only check it the first time it encounters it
                        (requires --link.external).

--link.revoke           Do not check whether https links' certificates have
-r                      been revoked (requires --link.external).

--link.xlink            Check crosslink IDs on the site being analysed. For
-X                      example, if a link goes to /index.html#id, then,
                        when this switch is set, ssc will verify that the id
                        exists and that it is not hidden.

--math.version          Presume this version of MathML (1, 2 or 3). The
                        following versions are supported:
                          0  work it out from the (HTML) version of the
                             file being analysed,
                          1  MathML 1,
                          2  MathML 2,
                          3  MathML 3,
                          4  MathML 4 (December 2020 draft).

--microdata.verify      Check microdata found in WhatWG microdata attributes
-m                      (itemprop, itemtype, etc.). Note that ssc only knows
                        about certain ontologies. Find out which with
                        --ontology.list.

--microdata.export      Export schema.org microdata encountered. This data
                        is exported in JSON format (not JSON-LD).

--microdata.root DIR    When exporting microdata with --microdata.export,
                        write files into the directory DIR. ssc will create
                        the directory tree structure as appropriate.

--microdata.virtual v=d When exporting microdata using --microdata.export,
                        export the contents of virtual directory 'v' to 'd'.
                        'v' must match a directory identified with
                        --site.virtual. For example:
                          --microdata.virtual virtual=X:\virtual

--microformat.verify    Verify Microformats data in class and rel attributes
-M                      (see https://microformats.org/).

--microformat.export    Export microformat data encountered in JSON format.
                        This option will write files in the same directory
                        as the source, with the extension .json.

--microformat.version x Presume microformats version x. The following values
                        are currently accepted:
                          1  microformats version 1 only,
                          2  microformats version 2 only,
                          3  both microformats versions 1 and 2.

--nits.catastrophe n    Redefine nit n as a catastrophe; may be repeated
                        (the value of n can be determined using --nits.nids
                        below).

--nits.codes            Output nit codes.

--nits.comment n        Redefine nit n as a comment; may be repeated (the
                        value of n can be determined using --nits.nids).

--nits.debug n          Redefine nit n as a debug message; may be repeated
                        (the value of n can be determined using
                        --nits.nids).

--nits.error n          Redefine nit n as an error; may be repeated (the
                        value of n can be determined using --nits.nids).

--nits.format F         Specify the output format; F is a template file (see
                        OUTPUT TEMPLATE below).

--nits.info n           Redefine nit n as information; may be repeated (the
                        value of n can be determined using --nits.nids).

--nits.nids             Output nit ids, which can be used to redefine nits.

--nits.quote X          Specify quote style when using --nits.format. X can
                        be one of 'text' or 'html'.

--nits.silence n        Silence nit n; may be repeated (the value of n can
                        be determined using --nits.nids).

--nits.warning n        Redefine nit n as a warning; may be repeated (the
                        value of n can be determined using --nits.nids).

--ontology.list         List known schema versions. See --schema.version

--ontology.ONT X.Y      Presume version X.Y of ontology ONT. For example:
                          --ontology.xsd 1.1
                        defaults usage of XSD to version 1.1. The versions
                        apply to RDFa, microdata, and microformats (using
                        class) analysis. If .Y is omitted, .0 is presumed.
                        X must be present. Unspecified defaults are derived
                        from the HTML version. For a list of possible
                        values, use --ontology.list.

                        At the time of writing, the following ontology
                        versions can be verified. Note that single version
                        ontologies cannot have their version changed:
                          article      12,14,18,22
                          as           1.0,2.0
                          bibo         1.3
                          book         12,14,18,22
                          cc           1.0
                          content      1.0
                          csvw         1.0
                          ctag         1.0
                          daq          1.0
                          dbp          1.0
                          dbp-owl      1.0
                          dbr          1.0
                          dc11         1.0,1.1
                          dcam         1.0
                          dcat         1.0,2.0
                          dcmi         1.0
                          dcterms      1.0,1.1
                          doap         1.0
                          dqv          1.0
                          describedby  1.0
                          duv          1.0
                          earl         1.0
                          event        1.0
                          foaf         0.1-0.99
                          frbr_core    1.0
                          gr           1.0
                          grddl        1.0
                          ical         1.0
                          icaltzd      1.0
                          jsonld       1.0,1.1
                          ldp          1.0
                          license      1.0
                          locn         1.0
                          ma           1.0
                          mf           1.0-2.255
                          music        12,14,18,22
                          oa           1.0
                          odrl         1.0
                          og           10,12,14,18,22
                          org          1.0
                          owl          1.0,2.0
                          poetry       1.0
                          profile      12,14,18,22
                          prov         1.0
                          ptr          1.0
                          qb           1.0
                          rdf          1.0-1.3
                          rdfa         1.0-1.3
                          rdfg         1.0
                          rdfs         1.0
                          rev          1.0
                          rif          1.0
                          role         1.0
                          rr           1.0
                          schema       2.0-13.0
                          sd           1.0
                          sioc         1.0
                          sioc_s       1.0
                          sioc_t       1.0
                          skos         1.0
                          skosxl       1.0
                          sosa         1.0
                          ssn          1.0
                          taxo         1.0
                          time         1.0
                          v            1.0
                          vann         1.0,1.1
                          vcard        1,2,3,4
                          video        12,14,18,22
                          void         1.0
                          wdr          1.0
                          wdrs         1.0
                          website      12,14,18,22
                          wwg          1.0
                          xhv          1.0
                          xml          1.0
                          xsd          1.0,1.1

                        vCard versions correspond to RDFa specs, published
                        in 2001, 2006, 2010 & 2014. They do NOT correspond
                        to vCard data format specifications.

                        Open Graph versions correspond to snapshots of the
                        specs from 2010, 2012, 2014, 2018 & 2022.

--shadow.changed        When shadowing a site that has been previously
                        shadowed, only copy/link files that have changed.

--shadow.comment        Do not delete comments when writing shadow pages.

--shadow.copy X         Create a shadow directory structure from source HTML
                        files, with errors removed and some things tidied
                        up. X can be:
                          no      copy nothing (default);
                          pages   write 'fixed' source files, ignore non
                                  source files;
                          hard    set up hard links to non-source files
                                  (requires source and shadow directories
                                  to be on the same disk);
                          soft    set up soft links to non-source files;
                          all     copy non HTML files too;
                          dedu    copy non HTML files too, but deduplicate
                                  them, changing links in HTML source if
                                  necessary;
                          report  report duplicates (no shadowing).
                        ssc cannot convert between versions of HTML, nor
                        between HTML and XHTML. The soft and hard link
                        options are only available on systems that support
                        them.

--shadow.enable         Enable shadowing (set by other shadow options). If
                        shadowing is enabled, but shadow.root is not set,
                        SSC will litter the site source directories with
                        .ndx files.

--shadow.file f         Write ssc's shadow cache to file f, to accelerate
                        future shadowing of the same content.

--shadow.ignore ext     When shadowing, ignore files with this extension
                        (may be repeated).

--shadow.info           Add a comment at or near the top of each shadowed
                        HTML file noting its generation time.

--shadow.msg text       Insert a comment containing the text at the top of
                        every generated page. Note that, if any SSI included
                        file is updated, the comment will appear whether or
                        not the original page is updated.

--shadow.root dir       Where to write the shadowed site.

--shadow.ssi            Do NOT resolve SSIs when shadowing, even if
                        --general.ssi is set.

--shadow.space          Leave excess/repeated spaces and blank lines in the
                        shadowed files untidily untouched.

--shadow.update         Only examine files that have changed since the last
-u                      time ssc ran. This is incompatible with
                        --corpus.file. This requires --shadow.file. Nits of
                        files that have not changed will not be reported.

--shadow.virtual v=d    When shadowing virtual directories, output the
                        shadow of virtual directory 'v' to directory 'd'.
                        'v' must match a directory set up using
                        --site.virtual.

--site.domain domain    The domain name of the site is 'domain'. This can be
-S domain               repeated. This is used to identify any URL that is
                        apparently external but is actually internal to the
                        site.

--site.extension ext    Treat files with this extension as X/HTML source
-x ext                  files. This may be repeated. Files with extension
                        .html are always checked.

--site.index file       This is the name of the index file in a directory.
-i file                 This can be repeated. This is used for checking
                        internal links.

--site.root dir         This is the root of the website to analyse. ssc will
-g dir                  recursively scan the directory analysing any HTML
                        files it finds. The default is the current
                        directory.

--site.virtual v=d      The HTML virtual directory 'v' is located in actual
-L v=d                  directory 'd' on the local filesystem. For example:
                          --site.virtual virtual=D:\actual

--spell.accept XXX      XXX is a correct spelling of a word (or a list of
                        words) in all languages.

--spell.check           Check text spelling. Uses external spelling
                        checkers, so results may be inconsistent between
                        systems.

--spell.dict LANG,DICT  Unix only. Associate dictionary DICT with LANG. For
                        example, if the standard English dictionary is
                        en_GB-large:
                          --spell.dict en-GB,en_GB-large
                        (Under Windows, ssc uses the OS dictionaries.)

--spell.list FN,LANG    The file FN contains a list of valid spellings for
                        language LANG (which may include country info). If
                        LANG is omitted, the valid spellings apply to all
                        languages. For example:
                          --spell.list villages.txt,en-IE
                          --spell.list dorfer.txt,de
                          --spell.list letzstied.txt

--spell.path PATH       Unix only. Path to spelling executable. Hunspell or
                        a compatible program is expected. If none is
                        specified, ssc will seek hunspell. (Under Windows,
                        ssc uses the OS spellchecker.)

--stats.meta            Produce statistics on <META> usage in <HEAD>. Note
                        that pragmas reported (http-equiv) are those found
                        in the HTML source, not those returned by the HTTP
                        protocol. Remember that many web servers (not all)
                        will remove some pragmas when serving pages.

--stats.page            Produce statistics for each source file encountered.

--stats.summary         Produce a summary of overall statistics for the
                        website.

--svg.version x         Presume any SVG code encountered is this version,
                        unless the SVG code itself specifies a version.
                        Versions recognised: 1.0, 1.1, 1.2 (really
                        1.2/tiny), 1.2/tiny, 1.2/full (May 2004 draft,
                        incomplete, any conflict with tiny always resolved
                        in favour of tiny), 2.0, 2.1 (April 2021 draft). If
                        this switch is not used, and some SVG code does not
                        identify its version, the version is derived from
                        the version of the host X/HTML code.

--validation.minor x    When validating W3 HTML 5 source code, use this
-m x                    minor version of W3 HTML 5. Valid values are 0, 1,
                        2, and 3. WhatWG versions are determined by date,
                        corresponding roughly to the date of the (online)
                        publication of the specific version. See the
                        --html.version switch.

--validation.microdata  Validate (schema.org) microdata.

--validation.*          Add a permitted value to a particular HTML
                        enumeration. Can be repeated. Extendable
                        enumerations include charset, class (valid values
                        may also be picked up from CSS files), colour,
                        currency, http-equiv, lang, metaname, mimetype, rel,
                        SGML, and many others. A full set of possible
                        enumerations can be listed using the --validation
                        switch.



CONFIGURATION FILE FORMAT

If a configuration file is used, it should be in INI file format. All content
is optional. Section and option names are derived from the long form switch
name, which consists of SECTION.OPTION, laid out in the format:

    [SECTION]
    OPTION=yes
    OPTION=123456

Switches that do not have a long form version cannot be used in a
configuration file.

Each ssc test (in the toast folder) has a configuration file; browse them for
examples.



ENVIRONMENT

QUERY_STRING            Run under OpenBSD's httpd server. See notes below.
SSC_CONFIG              If no configuration file is given on the command
                        line, use this one.
SSC_ARGS                Preliminary command line parameters.

If, when SSC is run, the environment variable QUERY_STRING is set to an
OpenBSD httpd server CGI value that includes the parameter html.snippet, then
SSC will nitpick that snippet only. Some other parameters are processed,
including general.verbose and html.version.



EXIT STATUS

If no significant nits are found, ssc exits with 0, otherwise it exits with a
value > 0.



OUTPUT TEMPLATE

Warning: output templates are a work in progress, and may be subject to
significant breaking change in future versions of ssc.

The --nits.format switch allows control of output format. It takes a file
name. The format of that text file is a sequence of fixed section names,
enclosed in square brackets on their own lines, each optionally followed by
text. In that text, certain specific identifiers, enclosed in brace pairs, are
substituted. For example:

    [dog-section]
    My pet dog {{dog-name}} is a {{bad-dog}}.

For examples, browse toast/output/*.nit

If no file is specified, or if the file cannot be loaded, a default template
is used. Note also the --nits.quote switch.



EXAMPLES

To verify the version of ssc:

    ssc -V

To check the static web site source directory /home/site/wwwroot:

    ssc /home/site/wwwroot

To check a static website for example.com, in the current directory, that uses
server side includes, including verification of external links, with very
verbose output:

    ssc -e -I -x html -x shtml -s example.com -v 5 -i index.shtml

To check a static web site in the current directory, with a virtual directory,
verifying microformats:

    ssc -L virtual=/home/site/virtual -M

To check a static web site using a configuration file:

    ssc -f config.file

A simple configuration file might contain:

    [general]
    verbose=4
    output=simple.out

    [site]
    domain=example.edu
    extension=html
    index=index.html
    root=simple

A configuration file to check a site against HTML 5.2 and SVG 1.1 might
contain:

    [general]
    output=site.out
    class=yes

    [link]
    check=yes

    [site]
    domain=example.edu
    extension=html
    index=index.html
    root=site

    [html]
    version=5.2

    [svg]
    version=1.1

A configuration file to check against a particular WhatWG living standard,
gathering statistics:

    [general]
    output=jan21.out

    [html]
    version=2021/01/01

    [link]
    check=yes

    [microdata]
    version=11.0

    [site]
    domain=example.edu
    extension=html
    index=index.html
    root=site

    [stats]
    summary=yes
    meta=yes

A configuration file to shadow copy and deduplicate a site might contain:

    [general]
    output=dedu.out
    class=yes

    [site]
    domain=example.edu
    extension=html
    index=index.html
    root=site

    [shadow]
    copy=5
    root=shadow
    file=dedu.ndx

A configuration file to export microdata preparing against schema.org version
7.2 might contain:

    [general]
    output=export.out
    class=yes

    [site]
    domain=example.edu
    extension=html
    index=index.html
    root=site

    [link]
    check=yes

    [microdata]
    export=yes
    root=export
    version=7.2



PREPARING and UPDATING a SITE

These notes are based on the steps I take to update an OpenBSD website.

Presume a directory containing the following:

    site.conf   ssc configuration file for a website
    site        shadow output produced by ssc

Then I run a script like this:

    ssc -f site.conf
    upload.sh site /var/www/site-upload server user 0
    ssh user@server "cd /var/www ; mv site x ; mv site-upload site ; mv x site-upload ; ln -sf site htdocs"

upload.sh is a macos bash script that can be found among the source code. Note
that I have rather naughtily replaced OpenBSD's httpd document directory
/var/www/htdocs with a link.

Here is site.conf:

    [general]
    verbose=info
    class=yes
    output=site.out
    ssi=yes
    ignore=pre
    rpt=yes

    [html]
    version=2021/04/01

    [link]
    check=yes
    xlink=yes

    [microformat]
    verify=yes

    [site]
    domain=example.com
    extension=html
    extension=shtml
    index=index.shtml
    root=corrupt_source

    [stats]
    summary=yes

    [shadow]
    copy=dedu
    root=site
    file=site.ndx
    ignore=inc
    info=yes



SEE ALSO
tidy
linkchecker



HISTORY
ssc is written by Dylan Harris, https://ssc.lu/.
</pre>
</section>
<HR>
<section id="issues">
<h2>known issues</h2>
<p>
SSC is pre–alpha software. It doesn’t do what it’s supposed to do, and what it’s supposed to do is wrong.
</p>
<ul><li>
SSC is built based on my understanding of various standards. My understanding is certainly wrong. I will have misread some text, and misunderstood what I read correctly;
</li><li>
I had to make a number of compromises when building the code.
Quite a lot of the checks are incomplete or even entirely missing; </li><li> I put my emphasis on standards that are actively followed, and put little effort, beyond the basics, into those that were never properly implemented; </li><li> I built an evolving product, reflecting evolving standards: biological evolution made the dodo, linguistic evolution made ‘hippopotamus’<SUP>*</SUP>, I made SSC; </li><li> The code was built to get something working. In many places, it is horrible. A great deal of refactoring could take place, had I the time; </li><li> No attempt was made to write secure code. It should only be run on trusted data. In particular, a great weakness of much software is the parser, and the parsers in SSC were handmade using hopeless optimism and bizarre ideas; </li><li> No attempt was made to write multi–threaded code (I consider SSC disk bound, and never mind that disks have evolved to SSDs since I started the project); </li><li> The tests are incomplete. Emphasis is placed on HTML 5, but even those tests suffer from missing content. </li></ul>
<p> Note that <a href="https://github.com/devongarde/">github</a> hosts a list of <a href="https://github.com/devongarde/ssc/issues">known issues</a>. </p>
<P><SUP>*</SUP> How can such a dangerous animal have such a cuddly name? It’s like calling the Hound of Hell ‘<a href="https://harrypotter.fandom.com/wiki/Fluffy">Fluffy</A>’. </P>
</section>
<HR>
<section id="build">
<h2>build</h2>
<pre>
BUILD NOTES

static site checker
https://ssc.lu/
(c) 2020-2022 Dylan Harris

Introduction
============

SSC can be built from various unii command lines using CMake, or with Visual
Studios 2017 / 2019 / 2022 under Windows.

Libraries
=========

Before you can build SSC, you may find you need to install and build boost
version 1.75 or better, a recent version of the ICU libraries, and a copy of
Microsoft's GSL library. Most unii have all available as packages.

You may need to set these environment variables:

    BOOST_ROOT to point to the boost source root directory
    (https://boost.org); you may also need to set BOOST_LIBRARYDIR and
    BOOST_INCLUDEDIR appropriately;

    GSL_ROOT to point to the GSL root directory
    (https://github.com/Microsoft/GSL);

    ICU_ROOT to point to the ICU library source root directory
    (https://icu-project.org/);

    SSCPATH to point to the ssc source directory (https://ssc.lu/).

Unii & mock Unii
----------------

Building SSC under unix, including macos, requires a development
installation of hunspell (https://hunspell.github.io/), & these environment
variables defined:

    HUNSPELL_INCLUDE to point to the hunspell include directory
    HUNSPELL_LIB to point to the hunspell library directory
    HUNSPELL_VERSION, the actual library name (such as "hunspell-1.7.so")

Once you've got them, navigate to recipe/tea, and run cmake.

Windows
-------

The Windows build uses the native Windows spellchecker, so you do not need
hunspell.

Building
========

Windows
-------

To build from Visual Studio, navigate to recipe/tea, open the appropriate
.sln file, then build. Only Visual Studios 2017 / 2019 / 2022, 64 bit, have
been built & tested, for Windows 8.1 & 10. If the 32 bit version builds, it
will generate oodles of annoying warnings, and the executable won't be able
to analyse larger sites.

Unii & mock Unii
----------------

You will need CMake 3.12 or better. From the home ssc directory, compile
thus:

    cd recipe/tea
    cmake .
    make
    ctest
    make install

If everything works correctly, then everything will be built, a series of
tests run, with a final result at the very end saying no failures. Having
said that, given SSC is pre-alpha, don't be too surprised to see some
warnings or some final test errors.

Note in particular that complaints about being unable to find or copy files
during testing are not of concern; these come from scripts that set up or
tear down individual tests, and the standard commands used sometimes
complain if they can't find files they're supposed to delete, rather than
saying thank you for reducing their work.

The following have successfully built, although not always under all
versions of ssc:

    Linux: CentOS 8 amd64, Ubuntu Server 20.04/20.10 amd64
    OpenBSD 6.8 / 6.9 / 7.0, amd64
    MacOS: Monterey, Big Sur, Catalina, Mojave, & High Sierra (all intel x64)

Note: Use clang if possible; gcc takes a wee while to build.

OpenBSD
-------

I've only tested the amd64 build under 6.8 / 6.9 / 7.0. The versions of
boost and cmake in packages are sufficient. You will need to increase
significantly the available memory setting in login.conf for the build
account, if you have not done so already. OpenBSD 6.8 offers hunspell 1.6,
so if you use that version, you will need to set the HUNSPELL_VERSION
environment variable appropriately.
</pre>
<h3>notes</h3>
<p> If everything works correctly, then everything will be built, a series of tests run, with a final result at the very end saying no failures. Having said that, given SSC is pre–alpha, don’t be too surprised to see some warnings or some final test errors. 
</p> </section> <HR> <section id="sauce"> <h2>source</h2> <h3>0.0.122</h3> <ul> <li> Added spelling checks & spell.xxx switches (requires hunspell on unix) </li> <li> Changed behaviour of binary switches (args are now processed, not presumed) </li> <li> A number of features are enabled by default </li> <li> underlying work / various refinements </li> <li> <A href="src/ssc-v122-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2022-01-16">22 1 16</time> </li> </ul> <h3>0.0.121</h3> <ul> <li> Living Standard Jan 2022 (very similar to October 2021) </li> <li> Drop 32 bit builds & macos before catalina </li> <li> underlying work </li> <li> <A href="src/ssc-v121-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2022-01-05">22 1 5</time> </li> </ul> <h3>0.0.120</h3> <ul> <li> RDFa </li> <li> added --ontology.list to list known ontology schema </li> <li> added --ontology.ONT x.y to set the default version of ontology ONT </li> <li> <A href="src/ssc-v120-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2022-01-02">22 1 2</time> </li> </ul> <h3>0.0.119</h3> <ul> <li> macos Monterey </li> <li> underlying work / various refinements </li> <li> <A href="src/ssc-v119-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-12-21">21 12 21</time> </li> </ul> <h3>0.0.118</h3> <ul> <li> Visual Studio 2022 solution </li> <li> underlying work / various refinements </li> <li> <A href="src/ssc-v118-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-11-22">21 11 22</time> </li> </ul> <h3>0.0.117</h3> <ul> <li> underlying work / various refinements </li> 
<li> <A href="src/ssc-v117-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-11-10">21 11 10</time> </li> </ul> <h3>0.0.116</h3> <ul> <li> change default HTML to living standard Oct 2021 </li> <li> <a href="https://openbsd.org/">OpenBSD</a> 6.9 / 7.0 </li> <li> underlying work / various refinements </li> <li> <A href="src/ssc-v116-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-10-24">21 10 24</time> </li> </ul> <h3>0.0.115</h3> <ul> <li> set unii installation directory to ~/bin </li> <li> added experimental solution for Visual Studio 2022 preview </li> <li> XHTML role attribute (https://www.w3.org/TR/xhtml-role/) </li> <li> now requires boost 1.75 or better </li> <li> unii builds now require CMake 3.12 or better </li> <li> underlying work </li> <li> <A href="src/ssc-v115-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-10-17">21 10 17</time> </li> </ul> <h3>0.0.114</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v114-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-10-06">21 10 6</time> </li> </ul> <h3>0.0.113</h3> <ul> <li> RDFa with schema.org, but otherwise no core initial context (yet) </li> <li> restore progress report via -D switch </li> <li> underlying work </li> <li> <A href="src/ssc-v113-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-09-30">21 9 30</time> </li> </ul> <h3>0.0.112</h3> <ul> <li> control output format </li> <li> underlying work </li> <li> <A href="src/ssc-v112-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" 
class="dt-published" datetime="2021-09-21">21 9 21</time> </li> </ul> <h3>0.0.111</h3> <ul> <li> living standard july 2021 </li> <li> schema.org v 13.0 </li> <li> added --shadow.enable </li> <li> drop Visual Studio 2015 </li> <li> <A href="src/ssc-v111-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-08-21">21 8 21</time> </li> </ul> <h3>0.0.110</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v110-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-08-02">21 8 2</time> </li> </ul> <h3>0.0.109</h3> <ul> <li> update flag, so SSC only looks at files that have changed recently </li> <li> <A href="src/ssc-v109-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-06-29">21 6 29</time> </li> </ul> <h3>0.0.108</h3> <ul> <li> default version of HTML 5 switched to W3’s HTML 5.2. </li> <li> added example website update script </li> <li> specify which page content goes in the corpus </li> <li> <A href="src/ssc-v108-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-06-15">21 6 15</time> </li> </ul> <h3>0.0.107</h3> <ul> <li> SVG 1.2/Tiny </li> <li> partial SVG 1.2/Full (May 2004 draft): <ul> <li> conflicts with 1.2/Tiny always resolved in favour of 1.2/Tiny </li> <li> extensions parsed but not processed </li> <li> not complete, nor will it ever be </li> </ul> </li> <li> SVG 2.0 (August 2018) with: <ul> <li> December 2018 Filter Effects </li> <li> April 2021 Animations draft </li> </ul> </li> <li> SVG 2.0 (April 2021 draft) (2.1 to be?) 
with: <ul> <li> October 2019 Filter Effects </li> <li> April 2021 Animations draft </li> </ul> </li> <li> <A href="src/ssc-v107-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-06-05">21 6 05</time> </li> </ul> <h3>0.0.106</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v106-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-05-31">21 5 31</time> </li> </ul> <h3>0.0.105</h3> <ul> <li> improved diagnostics on abort </li> <li> underlying work </li> <li> <A href="src/ssc-v105-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-05-24">21 5 24</time> </li> </ul> <h3>0.0.104</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v104-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-05-18">21 5 18</time> </li> </ul> <h3>0.0.103</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v103-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-05-11">21 5 11</time> </li> </ul> <h3>0.0.102</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v102-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-05-03">21 5 3</time> </li> </ul> <h3>0.0.101</h3> <ul> <li> <a href="https://mathml-refresh.github.io/mathml/">MathML 4, Dec 2020 draft</a> (it’s early days; MathML 4 is really <A href="https://www.w3.org/TR/MathML/">MathML 3</a> with post–it notes) </li> <li> can run in the <A href="https://www.openbsd.org/">OpenBSD</a> 6.8 <A href="https://man.openbsd.org/httpd">httpd</a> server CGI environment (do NOT expose SSC to 
untrusted data sources, such as those on the open web, without taking serious precautions: SSC is pre–alpha software, and probably has more bugs than the Creator’s Ultimate All–Beetle Extravaganza) </li> <li> --shadow.changed: only update files in the shadow directory when the originals have changed </li> <li> expanded ligature suggestions now work across systems </li> <li> recognise the non–standard character codes <kbd>&bang;</kbd> <kbd>&hash;</kbd> <kbd>&splat;</kbd> <kbd>&squiggle;</kbd> (! # * ~) </li> <li> improvements to corpus data extraction </li> <li> <A href="src/ssc-v101-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-04-21">21 4 21</time> </li> </ul> <h3>0.0.100</h3> <ul> <li> can nitpick against WhatWG Living Standard April 2021 (except MathML 4 & SVG 2) </li> <li> expanded character code suggestions, particularly for ligatures (Windows only) </li> <li> improved aria attribute verification </li> <li> the environment variable SSC_CONFIG can specify a configuration file </li> <li> the environment variable SSC_ARGS can specify command line arguments </li> <li> specify custom elements and custom attributes (see recipe/toast/type/custom/* for example) </li> <li> dump site corpus with -d switch </li> <li> <A href="src/ssc-v100-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-04-16">21 4 16</time> </li> </ul> <h3>0.0.99</h3> <ul> <li> checks microformats in microdata </li> <li> export living standard & microformats microdata </li> <li> <A href="src/ssc-v99-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-04-11">21 4 11</time> </li> </ul> <h3>0.0.98</h3> <ul> <li> living standard microdata ITEMTYPEs processed </li> <li> stats now only counts reported errors </li> <li> <A 
href="src/ssc-v98-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-04-07">21 4 7</time> </li> </ul> <h3>0.0.97</h3> <ul> <li> living standard jan 2005 – jan 2021, mostly </li> <li> MathML 4 and SVG 2 are not currently understood </li> <li> various microdata, including vcard, vevent, purl.org and n.whatwg.org, are not currently understood </li> <li> no spellchecker </li> <li> <A href="src/ssc-v97-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-03-31">21 3 31</time> </li> </ul> <h3>0.0.96</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v96-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-03-26">21 3 26</time> </li> </ul> <h3>0.0.95</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v95-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-03-25">21 3 25</time> </li> </ul> <h3>0.0.94</h3> <ul> <li> &lt;INPUT&gt; PATTERN checks </li> <li> improved diagnosis output </li> <li> <A href="src/ssc-v94-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-03-20">21 3 20</time> </li> </ul> <h3>0.0.93</h3> <ul> <li> processes schema.org 12.0 microdata </li> <li> <A href="src/ssc-v93-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-03-12">21 3 12</time> </li> </ul> <h3>0.0.92</h3> <ul> <li> recognise open graph meta names </li> <li> expand mime type checking </li> <li> <A href="src/ssc-v92-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" 
datetime="2021-03-16">21 3 16</time> </li> </ul> <h3>0.0.91</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v91-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-03-12">21 3 12</time> </li> </ul> <h3>0.0.90</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v90-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-03-12">21 3 12</time> </li> </ul> <h3>0.0.89</h3> <ul> <li> more media type / file extension checks </li> <li> underlying work </li> <li> <A href="src/ssc-v89-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-03-05">21 3 5</time> </li> </ul> <h3>0.0.88</h3> <ul> <li> added media type checks </li> <li> underlying work </li> <li> <A href="src/ssc-v88-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-03-02">21 3 2</time> </li> </ul> <h3>0.0.87</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v87-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-02-25">21 2 25</time> </li> </ul> <h3>0.0.86</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v86-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-02-15">21 2 15</time> </li> </ul> <h3>0.0.85</h3> <ul> <li> additional stats options, reporting &lt;DT&gt;&lt;DD&gt;, &lt;ABBR&gt;, & &lt;DFN&gt; content </li> <li> <A href="src/ssc-v85-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-02-12">21 2 12</time> </li> </ul> <h3>0.0.84</h3> <ul> <li> underlying work </li> <li> <A 
href="src/ssc-v84-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-02-01">21 2 01</time> </li> </ul> <h3>0.0.83</h3> <ul> <li> verifies various new living standard referenced http-equiv pragmas </li> <li> <A href="src/ssc-v83-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-02-01">21 2 01</time> </li> </ul> <h3>0.0.82</h3> <ul> <li> --schema.version now accepts + for HTML+ </li> <li> <A href="src/ssc-v82-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-01-29">21 1 29</time> </li> </ul> <h3>0.0.81</h3> <ul> <li> --schema.version now accepts x.y style versions </li> <li> --schema.minor removed </li> <li> <A href="src/ssc-v81-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-01-24">21 1 24</time> </li> </ul> <h3>0.0.80</h3> <ul> <li> adds a (prototype) man page (recipe/tea/gen.txt) </li> <li> adds --stats.meta to generate stats on &lt;META&gt; usage in &lt;HEAD&gt; </li> <li> checks content-security-policy values </li> <li> <A href="src/ssc-v80-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-01-18">21 1 18</time> </li> </ul> <h3>0.0.79</h3> <ul> <li> A new -z switch to specify the maximum preferred length of title text; </li> <li> <A href="src/ssc-v79-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2021-01-15">21 1 15</time> </li> </ul> <h3>0.0.78</h3> <ul> <li> underlying work </li> <li> <A href="src/ssc-v78-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" 
datetime="2021-01-03">21 1 3</time> </li> </ul> <h3>0.0.77</h3> <ul> <li> checks that the HTML page and the charset declared on it (if any) have something in common </li> <li> <A href="src/ssc-v77-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2020-12-20">20 12 20</time> </li> </ul> <h3>0.0.76</h3> <ul> <li> added --shadow.ignore to ignore files with specified extension </li> <li> <A href="src/ssc-v76-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2020-12-13">20 12 13</time> </li> </ul> <h3>0.0.75</h3> <ul> <li> added --microdata.root and --microdata.virtual for microdata exports </li> <li> Ubuntu Server 20.10 amd64 build </li> <li> default dedu cache now based on config file name </li> <li> underlying work </li><li> <A href="src/ssc-v75-sauce.tgz" type="application/gzip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2020-12-10">20 12 10</time> </li> </ul> <h3>0.0.74</h3> <ul> <li> can process <a href="https://schema.org">schema.org</a> 11.0 microdata; </li> <li> includes some microdata refinements. </li> <li> <A href="src/ssc-v74-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2020-12-06">20 12 6</time> </li> </ul> <h3>0.0.73</h3> <ul> <li> Export ‘repaired’ HTML files, including processing of Server Side Include directives; </li> <li> Deduplicate non–HTML files. When used with export, it copies one version of the file and modifies links appropriately. 
</li> <li> <A href="src/ssc-v73-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2020-12-05">20 12 5</time> </li> </ul> <h3>0.0.71</h3> <ul> <li> <A href="src/ssc-v71-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2020-11-25">20 11 25</time> </li> </ul> <h3>0.0.70</h3> <ul> <li> <A href="src/ssc-v70-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2020-11-22">20 11 22</time> </li> </ul> <h3>0.0.60</h3> <ul> <li> <A href="src/ssc-v60-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2020-10-29">20 10 29</time> </li> </ul> <h3>0.0.55</h3> <ul> <li> <A href="src/ssc-v55-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2020-10-20">20 10 20</time> </li> </ul> <h3>0.0.2</h3> <ul> <li> <A href="src/ssc-v2-sauce.zip" type="application/zip">download source</a> </li> <li> released <time itemprop="datePublished" class="dt-published" datetime="2020-04-20">20 4 20</time> </li> </ul> </section> <HR> <section id="boot"> <h2>boot notes</h2> <p> Notes on folder names: </p> <ul> <li> recipe: a nod to Vernor Vinge’s “<a href="https://en.wikipedia.org/wiki/A_Fire_Upon_the_Deep">A Fire Upon the Deep</a>” </li> <li> tea: without tea, nothing works; then there’s builders’ tea </li> <li> sauce: makes something dull quite delicious; identifies the arrogant; &, anyway, it’s obvious </li> <li> toast: toasts code; i like burnt toast </li> <li> heater: i’m not stopping now </li> </ul> </section> <HR> <section id="copy"> <h2>copyright & licence</h2> <P> Any dispute shall be resolved in accordance with <A 
href="https://uk.practicallaw.thomsonreuters.com/w-018-2382">the law of the Grand Duchy of Luxembourg</A>. </P> <pre> <a href="https://ssc.lu/">SSC</a> SSC, static site checker, <a href="https://ssc.lu/">https://ssc.lu/</a> copyright (c) 2020-2022 <a href="https://harris.eu.com/">dylan harris</a> This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA <a href="https://www.w3.org/">W3</a> Some test files come from <a href="https://www.w3.org/">w3.org</a> (some directly, in W3 documents, etc.), and are licensed as follows: License By obtaining and/or copying this work, you (the licensee) agree that you have read, understood, and will comply with the following terms and conditions. Permission to copy, modify, and distribute this work, with or without modification, for any purpose and without fee or royalty is hereby granted, provided that you include the following on ALL copies of the work or portions thereof, including modifications: The full text of this NOTICE in a location viewable to users of the redistributed or derivative work. Any pre-existing intellectual property disclaimers, notices, or terms and conditions. If none exist, the W3C Software and Document Short Notice should be included. 
Notice of any changes or modifications, through a copyright statement on the new code or document such as "This software or document includes material copied from or derived from [title and URI of the W3C document]. Copyright © [YEAR] W3C® (MIT, ERCIM, Keio, Beihang)." Disclaimers THIS WORK IS PROVIDED "AS IS," AND COPYRIGHT HOLDERS MAKE NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE OR DOCUMENT WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS. COPYRIGHT HOLDERS WILL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF ANY USE OF THE SOFTWARE OR DOCUMENT. The name and trademarks of copyright holders may NOT be used in advertising or publicity pertaining to the work without specific, written prior permission. Title to copyright in this work will at all times remain with copyright holders. Notes This version: http://www.w3.org/Consortium/Legal/2015/copyright-software-and-document Previous version: http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231 This version makes clear that the license is applicable to both software and text, by changing the name and substituting "work" for instances of "software and its documentation." It moves "notice of changes or modifications to the files" to the copyright notice, to make clear that the license is compatible with other liberal licenses. <a href="https://whatwg.org/">WhatWG</a> Some test files come from <a href="https://whatwg.org/">whatwg.org</a> (some directly, in WhatWG documents, etc.), and are licensed under a Creative Commons Attribution 4.0 International License. See <a href="https://whatwg.org/">https://whatwg.org/</a> for details. <a href="https://corruptpress.com">corruptpress.com</a> Some test files are derived from pages at <a href="https://corruptpress.com">corruptpress.com</a>. 
They are licensed under a Creative Commons Attribution 4.0 International License. Browse <a href="https://corruptpress.com">https://corruptpress.com/</a> for details. <a href="https://dylanharris.org">dylanharris.org</a> Some test files are derived from pages at <a href="https://dylanharris.org">https://dylanharris.org/</a>. They are licensed under a Creative Commons Attribution 4.0 International License. Browse <a href="https://dylanharris.org">https://dylanharris.org/</a> for details. </pre> </section> <HR> </main> <footer class="smaller"> <P> <A href="https://harris.eu.com/">Dylan Harris</a> is registered as a sole trader in Luxembourg <BR> BP 10133642/0   RCS A43134   VAT soon <BR><BR> © 2020-2022 <A href="https://dylanharris.org/">dylan harris</A><BR> themed with <a href="https://dylanharris.org/past/xenakis/1996/index.htm">retro 96</A><BR> <A href="mailto:dylan@harris.eu.com">email</A> and <A href="https://dylanharris.org/and/and/contact.shtml">contact</A> </P> </footer> </body> </html>