content
introduction
why
README
usage
known issues
bug reporting
build
download
boot notes
copyright & licence
introduction
The static site checker is an opinionated HTML nitpicker, a command–line tool to validate static HTML & XHTML websites. I built it to nitpick arts & ego, my hand–coded identity website.
It should not be used on untrusted content; its parsers are holier than Robin’s cow.
If you want to try it, here’s the current source. The build instructions follow.
Dylan Harris
October 2024
why SSC
Why did I make the static site checker? Aren’t there a lot of other HTML validators around? When I checked a few years ago, I couldn’t find a web site validator, only web page validators. Things may have improved. Anyway, my google foo is poo.
My identity website has more than 100,000 pages. I’m too impatient to push each through a validator one by one; I want to validate my site as a whole. Furthermore, single page validators can’t catch inter–page errors, such as broken internal links, let alone hidden links (an otherwise valid link to a HIDDEN element).
Many people avoid such problems by using frameworks. I find frameworks awful. IMAO, they produce dull, boring, trite design. The visual arts world has had centuries to develop excellent form for a rectangular space. Most 21st century frameworks are so crude they haven’t even absorbed 14th century visual arts’ ideas, when painters broke out of rectangular form in a rectangular frame. So much is possible, so much hasn’t happened. I want to break this dull, stultifying, archaic, mutton.
Maybe I’m making the wrong comparison, that the web isn’t about image, it’s about type. The Western visual arts never did really suss mixing writing and form (that’s not really true, but, IMAO, such arts never escaped their context). However, the Eastern visual arts most certainly did, and frameworks haven’t noticed them either.
Enough of this. Rather than criticising other people for not doing, I should do. I need to make some example sites. That’s where SSC comes in.
If I am to build a site using an experimental visual process, I can’t use frameworks. If I can’t use frameworks, I have to hand code. And there’s the key problem: HTML is such a convoluted, evolved mess, that the people who designed it, in their own design presentations, make errors. Ok, I only found this out by testing SSC on them, which conveniently illustrates that HTML is overcomplicated. I’m not going to reveal names because these people are working hard to make the web a better place. Let’s just say W3 had broken links, WhatWG referenced withdrawn ontologies, and many other authors’ sites have other internal inconsistencies. That the people who define the web make mistakes using their own design in their own documents that espouse their design, helps explain why most stick to dull, formulaic, boring, frameworks. To be fair, my HTML is far worse than any of these mild examples of technical naughtiness, which is why I had to write SSC.
I’ve yet to build a site inspired by visual art’s form and layout. My efforts have been spent building SSC, a tool to make that practical.
Since I’m here, I’ll list other issues I have with frameworks:
- They have to be regularly maintained. Every time an update comes out, that update has to be applied to a site, or the site becomes vulnerable to the exploits published by the update. Time is lost playing patch catchup.
- Updates for frameworks sometimes fail; instead of fixing issues, they break sites. That’s why I dropped Drupal. That’s why I swore off NextCloud—to which I returned when NitroPhone made maintenance an SEP.
- Frameworks use scripted languages, such as PHP. There were, in October 2021, five known vulnerabilities in PHP (far better than time was). Other frameworks use other scripted languages with other problems.
- IMAO, frameworks’ and scripts’ worst flaw is that many import code at runtime from third–party repositories. Their integrity and security are dependent on that of the repository, over which a site owner (usually) has no control. There are many examples of repositories being hacked, breaking the security and integrity of all sites that use them.
Dylan Harris
January 2024
README
Static Site Checker (an opinionated HTML nitpicker)
version 0.2.4
(c) 2020-2024 dylan harris
see LICENCE.txt & LICENSE.txt for copyright & licence notice
https://ssc.lu/
https://github.com/devongarde/ssc

ssc analyses static HTML snippets, files and sites:
— HTML living standard, Jan 2005 to Oct 2024
— HTML 1.0/+/2.0/3.0/3.2/4.00/4.01/5.0/5.1/5.2/5.3–draft
— CSS 1/2.0/2.1/2.2–draft, 2007-2023 snapshots, more
— SVG 1.0/1.1/1.2 Tiny/1.2 Full/2.0/2.x–draft Apr 2021
— MathML 1/2/3/4–draft Jul 2022
— XHTML 1.0/1.1/2.0/5.x
— finds broken links
— server side includes, mostly
— many ontologies

with opinions on:
— standard english where dialect is required
— perfectly legal but sloppy HTML
— abhorrent rudeness such as autoplay on videos

It does NOT:
— analyse or understand scripts
— analyse or understand XML or derivatives, except as noted above

It can output:
— ‘repaired’ HTML (not XHTML)
— HTML with resolved server side includes
— JSON of ontological content
— website statistical information
— deduplicated websites

ssc -h                 for a usage summary.
ssc -f config_file     analyse site using preprepared configuration
ssc directory          analyse website based in directory

To build & run:
1. Follow the build instructions in build.txt
2. Gleefully run ssc. It will misbehave if you are insufficiently gleeful.

This is an alpha version of ssc. It may contain unexpected features.

If you encounter such a delight, please help improve ssc by collecting the following information (where relevant):
— version of ssc;
— precise version of the operating system;
— hardware architecture and system information;
— detailed description of the problem;
— detailed description of the steps to recreate it;
— copy of output file showing the error;
— copy of pages/website being analysed;
— precise command used;
— configuration file(s) used, if any;
— any ndx file or other pre–existing file used during the run;
— any known workarounds or solutions;
— optionally, a dance interpretation of the ‘feature’;
and emailing everything to mail@ssc.lu (if the collected files are more than small, please use a public fileserver and email the link). Do NOT send anything confidential. Furthermore, unless you state otherwise, we reserve the right to publish some or all of the information sent in future versions of ssc, usually in the test suite.

If you have a fix, you are invited to submit a pull request on github. Thank you.

SSC can be run in a CGI environment. This is intended for use with OpenBSD’s native httpd web server. You are reminded that SSC is α software. Do NOT expose it to untrusted data sources, such as the open web, without taking serious precautions. SSC probably has more bugs than the Creator’s Ultimate All–Beetle Extravaganza (J.B.S. Haldane, apocryphal : “[the Creator has] an inordinate fondness for beetles.”).

Notes on names:
— recipe: a nod to Vernor Vinge’s “A Fire Upon the Deep”;
— tea: without tea, nothing works; then there’s builders’ tea;
— sauce: makes the dull tasty; identifies incompetent pedants;
— toast: toasts code; i liked burnt toast;
— heater: i’m not stopping now;
— unii: my preferred plural of unix: to my ears, both unixes and unices sound like they sing castrato.
— andor: and/or sans ancienne; land of Gift (aber nicht das Gift)

SEE ALSO
build.txt            notes on building ssc
gen.txt              a model man page
usage.txt            how to use ssc
releasenotes.txt     chips
LICENCE.txt          ssc licence information
LICENSE.txt          formal GPL 3 licence
more licences        licences for borrowed external content

Background
I have a website, arts & ego, at https://dylanharris.org/. It has approaching 60G of original content. It contains hand coded HTMLs 2, 3, 4 & 5. It is a complete mess. Despite a long search, I could not find any tools to properly identify its flaws. Anything I did find was at most cursory. Then came the cow flu*.

*corvid means crow, thus covid means cow**.
**by the rules of sympathetic spelling.

Unabashed Opportunism
If you appreciate modernist poetry or abstract photography, I’ve been published. Click on books at arts & ego for gen.

written by dylan harris
mail@ssc.lu
October 2024
usage
NAME
    ssc - static site checker

SYNOPSIS
    ssc [...] directory
    ssc -f config
    ssc

DESCRIPTION
    ssc (the Static Site Checker) is an opinionated HTML nit-picker, intended for people, like its author, who hand code websites. It doesn't just check static sites for broken links, dubious syntax, and bad semantic data, it will actively complain about things that are perfectly legal but rather untidy, like its author.

    Except when serving CGI queries, it recursively scans the directory seeking HTML & related files to analyse. It produces a list of errors, warnings, and other hints of imperfection. Scripts are ignored.

COMMAND LINE ONLY SWITCHES
    These options are only available on the command line:

    -a
        Ask the user to enter arguments, answer them; ask and answer again, until a blank line is entered. Arguments entered on the command line (with -a) will be processed as normal. Certain arguments entered during the argument answer cycle will be ignored, including -a and thread counts. (likely to be withdrawn)
    -A
        List switches read, then exit.
    -f file
        Load a configuration from a file, which should be in .INI file format. See CONFIGURATION FILE FORMAT below. This should be an absolute path.
    -F
        Load the configuration file .ssc/config in the current directory.
    -h
        Show a summary of switches, then exit.
    -H snippet
        Only nitpick this snippet of HTML.
    --ontology.list
        List known schema versions, then exit.
    -q
        Use a simple shell. The shell accepts the following commands:
            c    configure: enter a series of command line switches, one per line, then a full stop on a line by itself
            C    clear the current configuration
            h    a summary of available commands
            p    print the current configuration
            q    quit
            r    run using the current configuration
    -V
        Show version details, then exit.
    --validation
        List extendable attribute types, then exit. These types accept additional values on some X/HTML attributes and CSS properties. It is intended to allow checking of HTML etc. with bespoke extensions.
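The -q shell commands listed above lend themselves to scripting. What follows is a minimal sketch, not part of ssc: the helper name shell_script is hypothetical, and it assumes (the text above does not confirm this) that a switch and its argument share one line, and that the shell will read its commands from piped standard input. It only builds the text of a session; it does not run ssc.

```python
def shell_script(switches, run=True):
    """Build the text of an `ssc -q` session (hypothetical helper).

    Follows the documented commands: 'c' starts a list of switches, one
    per line, ended by a full stop on a line by itself; 'r' runs; 'q' quits.
    """
    for switch in switches:
        # A lone full stop would end the switch list early, and an embedded
        # newline would split one switch across two shell lines.
        if switch.strip() == "." or "\n" in switch:
            raise ValueError(f"cannot send {switch!r} as a switch line")
    lines = ["c", *switches, "."]
    if run:
        lines.append("r")
    lines.append("q")
    return "\n".join(lines) + "\n"


# Assumption: the argument sits on the same line as its switch.
print(shell_script(["--link.check", "--nits.verbose warning"]), end="")
```

If the piped input assumption holds, the resulting text could be fed to ssc -q; otherwise it still serves as a checklist of what to type.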
COMMAND LINE AND CONFIGURATION FILES SWITCHES
    These options are available on the command line (with dashes) and in configuration files (without dashes). The short form single letter alternative switches only work on the command line.

    Most binary options, e.g. those without arguments below that turn on a feature (which may be the default), have a corresponding "no-" switch to turn it off. The "no-" is inserted after the dot, so, for example, the contradiction to "--general.noh" would be "--general.no-noh". When both are specified, perhaps in a configuration file and on the command line, the "no-" switch always applies.

  Corpus
    Corpus switches control XML data for output to a local search engine.

    --corpus.article
        Prefer the content of <ARTICLE> when gathering corpus text.
    --corpus.body
        Prefer the content of <BODY> when gathering corpus text. This is the default.
    --corpus.main
        Prefer the content of <MAIN> when gathering corpus text.
    --corpus.output file
        Dump XML corpus of site into file. This is intended for use by a local search engine. If none of --corpus.article, --corpus.body, or --corpus.main are specified, the content of <BODY> is used. If more than one are specified, then the text collected depends on a page's content. This is incompatible with --shadow.update.

  CSS
    The CSS switches precisely control CSS interpretation. If you are checking a site with a CSS version contemporary to the given HTML version (see --html.version), you can ignore them. Otherwise, you probably only need --css.version. The other switches allow you to precisely specify the CSS modules presumed. Specific modules are defined at w3.org.

    --css.adjust X
        Use CSS Colour Adjustment level X, where X is 0 or 3.
    --css.anchor X
        Use CSS Scrollbar Anchoring level X, where X is 0 or 3.
    --css.animation X
        Use CSS Animation level X, where X is 0, 3 or 4.
    --css.background X
        Use CSS Backgrounds and Borders level X, where X is 0 or 3.
    --css.box-align X
        Use CSS Box Alignment level X, where X is 0 or 3.
    --css.box-model X
        Use CSS Box Model level X, where X is 0, 3 or 4.
    --css.box-sizing X
        Use CSS Box Sizing level X, where X is 0, 3 or 4.
    --css.cascade X
        Use CSS Cascading and Inheritance level X, where X is 0, 3, 4, 5 or 6.
    --css.colour X
        Use CSS Colour level X, where X is 0, 3, 4 or 5.
    --css.compositing X
        Use CSS Compositing and Blending level X, where X is 0 or 3.
    --css.cond-rule X
        Use CSS Conditional Rules level X, where X is 0, 3, 4, or 5.
    --css.contain X
        Use CSS Contain level X, where X is 0, 3, 4 or 5: see --css.version for gen.
    --css.content X
        Use CSS Generated Content level X, where X is 0 or 3.
    --css.cs X
        Use CSS Counter Style level X, where X is 0 or 3.
    --css.custom X
        Use CSS Custom Properties for Cascading Variables level X, where X is 0 or 3.
    --css.device X
        Use CSS Device Adaption level X, where X is 0 or 3.
    --css.display X
        Use CSS Display level X, where X is 0 or 3.
    --css.ease X
        Use CSS Easing Functions level X, where X is 0 or 3.
    --css.exclude X
        Use CSS Exclusions level X, where X is 0 or 3.
    --css.extension ext
        Presume files with extension '.ext' are CSS files.
    --css.fbl X
        Use CSS Flexible Box Layout level X, where X is 0 or 3.
    --css.filter X
        Use CSS Filter Effects level X, where X is 0 or 3.
    --css.float X
        Use CSS Page Floats level X, where X is 0 or 3.
    --css.font X
        Use CSS Fonts level X, where X is 0, 3, 4 or 5.
    --css.frag X
        Use CSS Fragmentation level X, where X is 0, 3 or 4.
    --css.grid X
        Use CSS Grid level X, where X is 0, 3 or 4: see --css.version for gen.
    --css.highlight X
        Use CSS Custom Highlights level X, where X is 0 or 3.
    --css.image X
        Use CSS Images level X, where X is 0, 3 or 4.
    --css.inline X
        Use CSS Inline Layout level X, where X is 0 or 3.
    --css.line-grid X
        Use CSS Line Grid level X, where X is 0 or 3.
    --css.list X
        Use CSS Lists and Counters level X, where X is 0 or 3.
    --css.logic X
        Use CSS Logical Properties level X, where X is 0 or 3.
    --css.marquee X
        Use CSS Marquee level X, where X is 0 or 3.
    --css.masking X
        Use CSS Masking level X, where X is 0 or 3.
    --css.media X
        Use CSS Media Queries level X, where X is 0, 3, 4 or 5.
    --css.mobile
        Test against the CSS Mobile Profile.
    --css.multi-column X
        Use CSS Multi-Column level X, where X is 0 or 3.
    --css.namespace X
        Use CSS Namespaces level X, where X is 0 or 3.
    --css.nes X
        Use CSS Non-Element Selectors level X, where X is 0 or 3.
    --css.overflow X
        Use CSS Overflow level X, where X is 0, 3 or 4.
    --css.overscroll X
        Use CSS Overscroll Behaviour level X, where X is 0 or 3.
    --css.page X
        Use CSS Paged Media level X, where X is 0 or 3.
    --css.position X
        Use CSS Positions level X, where X is 0 or 3.
    --css.present X
        Use CSS Presentation Levels level X, where X is 0 or 3.
    --css.print
        Test against the CSS Print Profile.
    --css.region X
        Use CSS Regions level X, where X is 0 or 3.
    --css.rhythm X
        Use CSS Rhythmic Sizing level X, where X is 0 or 3.
    --css.round X
        Use CSS Round Display level X, where X is 0 or 3.
    --css.ruby X
        Use CSS Ruby Annotations level X, where X is 0 or 3.
    --css.scope X
        Use CSS Scoping level X, where X is 0 or 3.
    --css.scrollbar X
        Use CSS Scrollbar Style level X, where X is 0 or 3.
    --css.sda X
        Use CSS Scroll-Driven Animations Style level X, where X is 0 or 3.
    --css.selector X
        Use CSS Selectors level X, where X is 0, 3 or 4.
    --css.shadow X
        Use CSS Shadow Parts level X, where X is 0 or 3.
    --css.shape X
        Use CSS Shapes level X, where X is 0, 3 or 4.
    --css.snap X
        Use CSS Scroll Snap level X, where X is 0 or 3.
    --css.spatial X
        Use CSS Spatial Navigation level X, where X is 0 or 3.
    --css.speech X
        Use CSS Speech level X, where X is 0 or 3.
    --css.style X
        Use CSS Style level X, where X is 0 or 3.
    --css.syntax X
        Use CSS Syntax level X, where X is 0 or 3.
    --css.table X
        Use CSS Tables level X, where X is 0 or 3 (this is an experimental spec, likely to change).
    --css.text X
        Use CSS Text level X, where X is 0, 3 or 4.
    --css.text-dec X
        Use CSS Text Decoration level X, where X is 0, 3 or 4.
    --css.tv
        Test against the CSS TV Profile.
    --css.transform X
        Use CSS Transforms level X, where X is 0, 3 or 4: see --css.version for gen.
    --css.transition X
        Use CSS Transitions level X, where X is 0 or 3.
    --css.ui X
        Use CSS Basic User Interface level X, where X is 0, 3 or 4.
    --css.value X
        Use CSS Values and Units level X, where X is 0, 3 or 4.
    --css.verify
        Verify CSS files (replaces --general.css).
    --css.version X
        Presume version X of CSS, where X is one of:
            1       CSS 1.0
            2.0     CSS 2.0
            2.1     CSS 2.1
            2.2     CSS 2.2 (Feb 2022 draft)
            3       all CSS level 3 so far
            4       all CSS level 4 so far
            5       all CSS level 5 so far
            6       all CSS level 6 so far
            2007
            2010
            2015    2015+    2015++
            2017    2017+    2017++
            2018    2018+    2018++
            2020    2020+    2020++
            2021    2021+    2021++
            2022    2022+    2022++
            2023    2023+    2023++
            2024    2024+    2024++
        The years are CSS snapshots: the year itself for stable modules, with + for wobbly modules, and ++ for wibbly-wobbly modules, as per the corresponding W3 CSS snapshots (the terminology in those snapshots is inconsistent, hence our use of the scientific terms wobbly and wibbly-wobbly).
        For levels 3, 4, 5 and 6, note that extensions that are part of neither CSS 1 nor CSS 2.x specifications are numbered three and upwards in ssc, for internal consistency. If you wish to use an extension named ... level 1, that is not part of CSS 1, specify 3. Similarly, for those named level 2 that are not part of any CSS 2 specification, etc.
    --css.view X
        Use CSS View Transitions level X, where X is 0 or 3.
    --css.wc X
        Use CSS Will Change level X, where X is 0 or 3.
    --css.writing X
        Use CSS Writing Mode level X, where X is 0, 3 or 4.

  General switches
    These are switches that don't really belong in any other section.

    --general.class
        Nitpick class values.
    --general.classic
        Report all classes used, not just those in CSS files.
    --general.cgi, -W
        Check environment variables for snippets of HTML. SSC expects environment variables as produced by OpenBSD's native httpd, produced using <FORM METHOD=GET ...>.
        Do NOT let ssc anywhere near untrusted data. Ignores many options such as shadowing.
    --general.datapath dir, -C dir
        Look for any configuration, caches, and other useful files, in this directory.
    --general.defthrd N, -Y N
        If --general.thread is not given, then set the number of threads to N. The default is 1. If 0 is specified, then select a number of threads not entirely inappropriate for the hardware.
    --general.exclude xxx
        Ignore all paths containing xxx. May be repeated. Case independent under Windows only. .DS_Store is always excluded under darwin.
    --general.file XXX
        File for persistent data. See also --general.datapath. Default extension: .ndx.
    --general.info
        Report launch context when starting.
    --general.maxfilesize n
        Do not process HTML source files that exceed n bytes in size (default: 4M). Specify 0 for unlimited, although be warned that ssc is stunningly stupid in such circumstances and may even attempt to load files bigger than available memory.
    --general.output file, -o file
        Output to the specified file. If this switch is not used, standard output is used.
    --general.progress, -D
        Dump progress information to standard output. This can interfere with formatted output.
    --general.rdfa
        Check RDFa attributes (version 1.1.3). This is intended for ontology testing only, so is incomplete.
    --general.rpt
        Report CSS files that are opened.
    --general.spec, -j
        Reset the values of most switches to false.
    --general.test, -T
        Output data in automated test format. Used by ssc-test. Not generally useful. Documented so you can avoid using it!
    --general.thread N, -y N
        Use N threads when running. Defaults to 1. If 0 is given, a value not entirely inappropriate for the hardware is used. Too high a value can cause problems. See also --general.defthrd.
    --general.vcs
        Excludes, as per --general.exclude, files and directories called: .bazaar .bk CVS .cvsignore _darcs .fslckout .git .gitattributes .gitignore .gitmodules .pijul RCS SCCS .svn

  HTML
    The only HTML switch you are likely to need is --html.version, and then only if you want to check a site that is not contemporary to the build of ssc. The remaining switches allow you to precisely control analysis of older sites.

    --html.custom EL
        Define a custom element <EL> for verifying the IS attribute. May be repeated.
    --html.force
        If <!DOCTYPE...> is missing, force presumption of --html.version value, not HTML 1/tags.
    --html.ie
        Don't mention certain Internet Explorer 'features'.
    --html.ignore EL
        Ignore attributes and content of the element <EL>. May be repeated.
    --html.lang LA
        If an X/HTML file does not have a language / dialect specified (e.g. "en" for generic English, "en-IE" for Irish English, "lb-LU" for Luxembourgish, "ma" for Marain, etc.), default to 'LA'. If not given, the default is your system default, or, if none, then "en-US".
    --html.rel
        Only mention <LINK> REL values, found neither in the living standard nor at microformats.org, in debug output.
    --html.rfc1867
        Ignore the RFC 1867 (INPUT=FILE) extension when processing HTML 2.0.
    --html.rfc1942
        Ignore the RFC 1942 (tables) extension when processing HTML 2.0.
    --html.rfc1980
        Ignore the RFC 1980 (client side image maps) extension when processing HTML 2.0.
    --html.rfc2070
        Ignore the RFC 2070 (internationalisation) extension when processing HTML 2.0.
    --html.ruby
        Accept Ruby Markup Extension (draft, late April 2024) for HTML from May 2024 onwards.
    --html.safari
        Don't mention certain early Safari 'features'.
    --html.sloven
        Ignore perfectly legal yet inefficient, indeed thoroughly slovenly, HTML, such as being far too lazy to bother to get round to closing elements.
    --html.ssi, -I
        Process Server Side Includes (SSIs). Note ssc cannot process SSI directives with formulae.
        Processing SSIs may cause incorrect line numbers to be mentioned when an issue is reported.
    --html.tags
        When an HTML file is loaded that contains no DOCTYPE, ssc normally presumes HTML 1. This switch tells it to presume the file conforms to an earlier HTML Tags specification (the one at CERN). This is overridden by --html.version.
    --html.title n, -z n
        If <TITLE> text is longer than n characters, say so. This applies to text enclosed by a <TITLE> element under <HEAD>, not the value of TITLE attributes.
    --html.version X
        If no doctype (or xml header) is specified, presume version X of HTML. X can be:
            tags         HTML tags (1991, informal)
            1            HTML 1.0 (Jun 1993 draft)
            1.0          HTML 1.0 (Jun 1993 draft)
            +            HTML Plus (Nov 1993 draft)
            2            HTML 2.0
            2.0          HTML 2.0
            3            HTML 3.2
            3.0          HTML 3.0 (Mar 1995 draft)
            3.2          HTML 3.2
            4            HTML 4.01
            4.0          HTML 4.0
            4.1          HTML 4.01
            4.2          XHTML 1.0
            4.3          XHTML 1.1 core
            4.4          XHTML 2.0 (Dec 2010 draft)
            5            recent WhatWG HTML 5
            5.0          W3 HTML 5.0
            5.1          W3 HTML 5.1
            5.2          W3 HTML 5.2
            5.3          W3 HTML 5.3 (Oct 2018 draft)
            2005/1/1     WhatWG WebApps draft (Jan 2005)
            ...          (halfly)
            2007/1/1     WhatWG WebApps draft (Jan 2007)
            2007/7/1     WhatWG HTML 5 (Jul 2007)
            ...          (halfly)
            2021/1/1     WhatWG HTML 5 (Jan 2021)
            ...          (quarterly)
            2024/4/1     WhatWG HTML 5 (Apr 2024)
            XHTML 1.0    XHTML 1.0
            XHTML 1.1    XHTML 1.1 core
            XHTML 2.0    (Dec 2010 draft)
            XHTML 5.x    XHTML corresponding to equivalent W3 HTML
        Although you can specify exact dates for versions of the WhatWG HTML 5 living standard, currently only broad versions published in January and July are supported (quarterly from 2021).
        Certain versions of HTML offer variants, such as loose and strict definitions. ssc picks those up from the <!DOCTYPE ...> in the HTML file, if any, and then carefully ignores them. Validation of XHTML is even less strict. Just to remind you, there are no guarantees of accuracy (or inaccuracy).
        Copies of the appropriate standards can be found online. A copy of the copies referenced during ssc's development can be found at https://ssc.lu/.
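The dated --html.version values can be worked out from a page's authoring date. This is a minimal sketch under stated assumptions: the function name snapshot is hypothetical, and rounding a date down to the previous snapshot is my choice, not documented ssc behaviour. It uses only the cadence described above (January and July until 2020, quarterly from 2021) and the earliest listed value, 2005/1/1; it does not check whether a given build of ssc supports the result.

```python
from datetime import date

FIRST_SNAPSHOT = date(2005, 1, 1)  # earliest dated value in the list above


def snapshot(day):
    """Round a date down to a dated --html.version value (hypothetical helper)."""
    if day < FIRST_SNAPSHOT:
        raise ValueError("no dated snapshot is listed before 2005/1/1")
    if day.year >= 2021:
        # Quarterly from 2021: months 1, 4, 7 and 10.
        month = 3 * ((day.month - 1) // 3) + 1
    else:
        # Half-yearly before 2021: January or July.
        month = 1 if day.month < 7 else 7
    return f"{day.year}/{month}/1"


# A page written in March 2019 and one written in May 2022.
print("--html.version", snapshot(date(2019, 3, 15)))
print("--html.version", snapshot(date(2022, 5, 2)))
```

Rounding down seems the safer guess, since a page cannot rely on features published after it was written.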
  Link switches
    If you want to check links on the site, you'll find these switches useful, particularly --link.external. --link.check is a must, it spots broken links within the site.

    --link.301, -3
        Normally, when ssc checks external links (--link.external), it does not report http forwarding errors 301 and 308. Use this switch to have it do so.
    --link.check, -l
        Check internal links, e.g. those within the website being analysed.
    --link.example
        Report links to faux domains, as defined by RFC 2606 (note ssc also reports links to example.edu, example.gov & example.mil).
    --link.external, -e
        Check external links, e.g. those not on the site being checked. Note that ssc will NOT check RFC 2606 links, such as example.com (see --link.example).
    --link.forward
        Report HTTP forwarding errors encountered when checking external links (e.g. 301 and 308).
    --link.ignore DOMAIN
        When checking external links, ignore this domain. May be repeated.
    --link.local
        Report links to local domains, such as domains ending in .lan, .home, .corp, and others.
    --link.once, -O
        Only report each broken external link once. If, for example, the site has a number of references to a page that does not exist, ssc will only report the first instance of the broken link. Note that, even if it reports every occurrence of the link, it will only check it the first time it's encountered (requires --link.external).
    --link.pretend FILE
        Pretend links containing FILE exist. May be repeated.
    --link.report DOMAIN
        Report links to DOMAIN and its descendants. May be repeated.
    --link.revoke, -r
        Do not check whether links' https certificates have been revoked (requires --link.external).
    --link.xlink, -X
        Check crosslink IDs on the site being analysed. For example, if a link goes to /index.html#id, then, when this switch is set, ssc will verify that the id exists and that it is not hidden.

  MathML switches
    These switches are useful when you have some MathML which is not contemporary to the corresponding HTML.
    --math.version N
        Presume version N of MathML (1, 2, 3 or 4). The following versions are supported:
            0       based on the HTML version
            1       MathML 1
            2       MathML 2
            3       MathML 3
            4.20    MathML 4 2020 draft
            4       MathML 4 2022 draft
            core    MathML 4 core (May 2022 draft)

  Microformat switches
    These switches are useful for checking andor outputting any microformat data found.

    --microformat.export
        Export microformat data encountered in JSON format. This option will write files in the same directory as the source, with the extension .json.
    --microformat.verify, -M
        Verify Microformats data in class and rel attributes (see https://microformats.org/).
    --microformat.version x
        Presume microformats version x. The following values are currently accepted:
            1    microformats version 1 only
            2    microformats version 2 only
            3    both microformats versions 1 and 2

  Nits
    Nits are the output of ssc, the static site NITpicker. You will need these switches if you want to hide certain nits, output lots of extra gen, etc.

    --nits.abhorrent n
        Redefine nit n as an abhorrence; may be repeated (the value of n can be determined using --nits.nids below).
    --nits.catastrophe n
        Redefine nit n as a catastrophe; may be repeated (the value of n can be determined using --nits.nids below).
    --nits.comment n
        Redefine nit n as a comment; may be repeated (the value of n can be determined using --nits.nids).
    --nits.debug n
        Redefine nit n as a debug message; may be repeated (the value of n can be determined using --nits.nids).
    --nits.error n
        Redefine nit n as an error; may be repeated (the value of n can be determined using --nits.nids).
    --nits.errorexit x, -E
        If nits of the specified category or worse are generated, then, on exit, return an error code. Values are: 'catastrophe', 'error' (the default), 'warning', 'info', or 'comment'.
    --nits.expand
        Expand text content of certain nits.
    --nits.extra
        Report additional nits.
    --nits.format F
        Specify the output format; F is a template file (see OUTPUT TEMPLATE below).
    --nits.info n
        Redefine nit n as information; may be repeated (the value of n can be determined using --nits.nids).
    --nits.nids
        Output nit ids, which can be used to redefine nits.
    --nits.override F
        Use this output format, not the one specified by --nits.format. F is a template file (see OUTPUT TEMPLATE below). This switch is intended to aid automation.
    --nits.quote X
        Specify quote style when using --nits.format. X can be 'text' or 'html'.
    --nits.root
        By default, seek nit output template files in the website root.
    --nits.silence n
        Silence nit n; may be repeated (the value of n can be determined using --nits.nids).
    --nits.unique
        Do not output repeated nits, even if they may contain additional information.
    --nits.verbose x, -v
        Output nits to the specified verbosity: 'catastrophe', 'abhorrent', 'error', 'warning', 'info' (the default), 'comment', or '0' for silence. Additional values are available when debugging. Each level includes its preceding level, so, for example, 'warning' will also output 'catastrophe', 'abhorrent', and 'error' nits.
    --nits.warning n
        Redefine nit n as a warning; may be repeated (the value of n can be determined using --nits.nids).
    --nits.watch
        Output debug nits (intended for automation).

  Ontology switches
    If you are interested in checking andor hoovering ontology data, you may find these switches useful. Note that ssc only knows about certain ontologies (see --ontology.list).

    --ontology.export
        Export ontologies encountered. This data is exported in JSON format (not JSON-LD).
    --ontology.root DIR
        When exporting ontologies with --ontology.export, write files into the directory DIR. ssc will create the directory tree structure as appropriate.
    --ontology.verify
        Check ontology found in WhatWG living standard microdata attributes (itemprop, itemtype, etc.).
    --ontology.virtual v=d
        When exporting ontologies using --ontology.export, export the contents of virtual directory 'v' to 'd'. 'v' must match a directory identified with --site.virtual.
        For example: --ontology.virtual virtual=X:\virtual.
    --ontology.ONT X.Y
        Presume version X.Y of ontology ONT. For example: --ontology.xsd 1.1 defaults usage of XSD to version 1.1. This versioning applies to RDFa, microdata, and microformats (using class) analysis. If .Y is omitted, .0 is presumed. X must be present. Unspecified defaults are derived from the HTML version. For a list of possible values, use --ontology.list.
        At the time of writing, the following ontology versions can be verified. Note that single version ontologies cannot have their version changed:
            adms           1.0,2.0
            article        12,14,18,22
            as             1.0,2.0
            basic          1.0-1.3,2.1,3.0 (see below)
            bfo            2.0,2020 (see below)
            bibo           1.3
            biro           1.1
            book           12,14,18,22
            cc             1.0
            cito           2.8
            content        1.0
            crs            1.0 (see below)
            csvw           1.0
            ctag           1.0
            daq            1.0
            dbp            1.0
            dbp-owl        1.0
            dbr            1.0
            dc11           1.0,1.1
            dcam           1.0
            dcat           1.0,2.0
            dcmi           1.0
            dcterms        1.0,1.1
            ddi            1.0
            doap           1.0
            dpv*           0.1-2.0 (see below)
            dqv            1.0
            describedby    1.0
            duv            1.0
            earl           1.0
            event          1.0
            exif           1.0-3.0 (see below)
            exifex         2.21-3.0 (see below)
            foaf           0.1-0.99
            frbr_core      1.0
            gr             1.0
            grddl          1.0
            gs1            1.1-1.5
            ical           1.0
            icaltzd        1.0
            jsonld         1.0,1.1
            ldp            1.0
            license        1.0
            locn           1.0
            ma             1.0
            mf             1.0-2.255
            music          12,14,18,22
            oa             1.0
            odrl           1.0
            og             10,12,14,18,22 (see below)
            org            1.0
            owl            1.0,2.0
            pam            2.0 (see below)
            pcm            3.1 (see below)
            pcmm           3.0 (see below)
            pcv            1.0 (see below)
            pdf            1.0 (see below)
            photoshop      1.0 (see below)
            pim            1.0-3.0 (see below)
            pmi            3.0 (see below)
            poetry         1.0,1.1
            prism          1.0-3.0 (see below)
            prism-ad       3.0 (see below)
            prl            1.0-2.0 (see below)
            prm            3.0 (see below)
            prs            3.1 (see below)
            profile        12,14,18,22
            prov           1.0
            psv            1.0 (see below)
            ptr            1.0
            pur            2.1-3.0 (see below)
            qb             1.0
            rdf            1.0-1.3
            rdfa           1.0-1.3
            rdfg           1.0
            rdfs           1.0
            rev            1.0
            rif            1.0
            role           1.0
            rr             1.0
            schema.org     0.10-28 (see below)
            sd             1.0
            sioc           1.0
            sioc_s         1.0
            sioc_t         1.0
            skos           1.0
            skosxl         1.0
            sosa           1.0
            ssn            1.0
            stdim          1.0 (see below)
            stevt          1.0 (see below)
            stfnt          1.0 (see below)
            stjob          1.0 (see below)
            stref          1.0 (see below)
            stver          1.0 (see below)
            taxo           1.0
            tiff           6.0
            time           1.0
            v              1.0
            vann           1.0,1.1
            vcard          1,2,3,4 (see below)
            video          12,14,18,22
            void           1.0
            wdr            1.0
            wdrs           1.0
            website        12,14,18,22
            wwg            1.0
            xhv            1.0
            xml            1.0
            xmp            1.0 (see below)
            xmpdm          1.0 (see below)
            xmpg           1.0 (see below)
            xmpgimg        1.0 (see below)
            xmpidq         1.0 (see below)
            xmpmm          1.0 (see below)
            xmprights      1.0 (see below)
            xmptpg         1.0 (see below)
            xsd            1.0,1.1

        The various Adobe ontologies (crs, pdf, photoshop, stdim, stevt, stfnt, stjob, stref, stver, xmp, xmpdm, xmpg, xmpgimg, xmpidq, xmpmm, xmprights, xmptpg) have only been partially applied. They do not seem to have been designed for microdata, hence the partial implementation: the goal is to enable hoovering to JSON.

        BFO (Basic Formal Ontology) versions should be specified as follows:
            Use     For
            2.0     2.0
            2.2     2020
        BFO 2020 uses OBO's machine code style identifiers. Given the history of computing science, as a convenience for users, and with my experience of both devops and maintaining code, identifiers following the standard ontology naming convention are also accepted. Since this is unofficial, both standard English and American dialect spellings are processed.

        The data privacy family of ontologies follow this versioning scheme:
            Use     For
            0.10    0.1
            0.20    0.2
            0.30    0.3
            0.40    0.4.0
            0.41    0.4.1
            0.42    0.4.2
            0.50    0.5
            0.60    0.6
            0.70    0.7
            0.80    0.8.0
            0.81    0.8.1
            0.82    0.8.2
            0.90    0.9
            1.0     1
            2.0     2
        The data privacy ontology versions:
            ai                2
            dpv               0.1-2
            eu-aiact          2
            eu-dga            2
            eu-gdpr           2
            eu-nis2           2
            eu-rights         2
            gdpr              0.1-1
            justifications    2
            legal             0.5-1
            legal-de          2
            legal-eu          2
            legal-gb          2
            legal-ie          2
            legal-in          2
            legal-us          2
            loc               2
            nace              0.1-1
            pd                0.4-2
            rights-eu         0.8-2
            risk              0.8-2
            tech              0.8-2

        The Exif & ExifEx ontologies have the following versions:
            Use     For
            1.0     1.0 (exif only)
            1.1     1.1 (exif only)
            2.0     2.0 (exif only)
            2.10    2.1 (exif only)
            2.20    2.2 (exif only)
            2.21    2.21
            2.30    2.3
            2.31    2.31
            2.32    2.32
            3.0     3.0
        Manufacturers' extensions to EXIF are omitted, with exceptions.

        Open Graph versions correspond to snapshots of the specs from 2010, 2012, 2014, 2018 & 2022.
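The data privacy versioning scheme tabulated above follows a regular pattern, which can be captured in a few lines. This is a minimal sketch: the function name dpv_switch_value is hypothetical, the pattern is inferred from the table rather than taken from ssc's source, and the function does not check that ssc actually knows the version it is given.

```python
def dpv_switch_value(published):
    """Map a published data privacy version, such as '0.8.1', to the value
    the table above says to use, such as '0.81' (hypothetical helper)."""
    parts = [int(p) for p in published.split(".")]
    major, rest = parts[0], parts[1:]
    if major >= 1:
        # The table only lists whole releases here: 1 -> 1.0, 2 -> 2.0.
        if any(rest):
            raise ValueError(f"no rule in the table for {published!r}")
        return f"{major}.0"
    minor = rest[0] if len(rest) > 0 else 0
    patch = rest[1] if len(rest) > 1 else 0
    if len(rest) > 2 or not (1 <= minor <= 9 and 0 <= patch <= 9):
        raise ValueError(f"no rule in the table for {published!r}")
    # 0.minor.patch becomes 0.<minor><patch>, so 0.5 -> 0.50 and 0.4.1 -> 0.41.
    return f"0.{minor}{patch}"


for version in ("0.1", "0.4.1", "0.8.2", "1", "2"):
    print(version, "->", dpv_switch_value(version))
```

The result is the X.Y part of a switch of the form --ontology.ONT X.Y, where ONT is one of the data privacy ontologies listed above.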
    The various Prism ontologies (pam, pamp, pcm, pcmm, pcv, pim, pmi,
    prism, prism_ad, prl, prm, prs, psv, pur) have only been partially
    applied: some specifications are unavailable, some specifications
    break HTML5 syntax. Prism was not designed for microdata, hence the
    partial implementation: the goal is to enable hoovering to JSON.

    Most versions of schema (schema.org) should be specified by their
    version number, but this doesn't work with early versions, which
    should be specified as follows:

    Use         For
    0.10        June 2011
    0.15        July 2011
    0.20        August 2011
    0.25        September 2011
    0.30        October 2011
    0.35        November 2011
    0.40        December 2011
    0.45        January 2012
    0.50        February 2012
    0.55        March 2012
    0.60        April 2012
    0.91-0.99   as version number
    1.0         1.0a
    1.1         1.0b
    1.2         1.0c
    1.3         1.0d
    1.4         1.0e
    1.5         1.0f
    1.10        1.1
    1.20        1.2
    1.30        1.3
    1.40        1.4
    1.50        1.5
    1.60        1.6
    1.70        1.7
    1.80        1.8
    1.90        1.9
    1.91        as version number
    ...
    28          as version number

    vCard versions correspond to RDFa specs, published in 2001, 2006,
    2010 & 2014. They do NOT correspond to vCard data format
    specifications.

Server switches

A simple web / web socket server is available to provide a GUI for ssc,
and to support a simple service. If this is used for more than simple
tasks, it should be put behind the usual array of good quality services,
such as a firewall, a proxy, and so on. It is not designed to be robust.
The following switches are available:

--server.enable
    Enable the server (default disabled)

--server.accept F,T
    Accept connections from clients in the address range F to T. If T is
    omitted, it is F. The default is 127.0.0.1. Non-local address ranges
    are rejected. May be repeated.

--server.address A
    Serve on this address (default 127.0.0.1). Use * for all addresses.

--server.paramters f
    The certificate parameters can be found in the file f.

--server.passfile f
    The certificate password can be found in the file f.

--server.password xxx
    The certificate password is xxx.
    This switch is only available in configuration files, and does not
    work on the command line.

--server.port P
    Serve on this port (default 80, until I think of a better one).

--server.private f
    The certificate private key can be found in the file f.

--server.public f
    The certificate public key can be found in the file f.

Shadow switches

A shadow is a copy of the site being analysed, with, for example, SSIs
resolved, bad content removed, and duplicated content consolidated.

--shadow.changed
    When shadowing a site that has been previously shadowed, only
    copy/link files that have changed.

--shadow.comment
    Do not delete comments when writing shadow pages.

--shadow.copy X
    Create a shadow directory structure from source HTML files, with
    errors removed and some things tidied up. X can be:

    no       copy nothing (default)
    pages    write 'fixed' source files, ignore non source files
    hard     set up hard links to non-source files (requires source and
             shadow directories to be on the same disk) (see below)
    soft     set up soft links to non-source files (see below)
    all      copy non HTML files too
    dedu     copy non HTML files, but deduplicate them, changing links
             in HTML source as necessary (see below)
    report   report duplicates (no shadowing)

    ssc cannot convert between versions of HTML, nor between HTML and
    XHTML. Link options are only available on systems that support
    filesystem links.

--shadow.enable
    Enable shadowing (set by other shadow options). If shadowing is
    enabled, but shadow.root is not set, SSC will litter the site source
    directories with .ndx files.

--shadow.file f
    Write ssc's shadow cache to file f, to accelerate future shadowing
    of the same content, updated.

--shadow.ignore ext
    When shadowing, ignore files with this extension (may be repeated).

--shadow.info
    Add a comment at or near the top of each shadowed HTML file noting
    its generation time.

--shadow.msg text
    Insert a comment containing text at the top of each generated page.
    Note that, if any SSI include file is updated, the comment will
    appear whether or not the original page has changed.

--shadow.root dir
    Where to write the shadow site.

--shadow.space
    Leave excess/repeated spaces and blank lines in the shadowed files
    untidily untouched.

--shadow.ssi
    Do NOT resolve Server Side Includes when shadowing, even if
    --general.ssi is set.

--shadow.update
-u
    Only examine files that have changed since the last time ssc ran.
    This is incompatible with --corpus.file. This requires
    --shadow.file. Nits of files that have not changed will not be
    reported again.

--shadow.virtual v=d
    When shadowing virtual directories, output the shadow of virtual
    directory 'v' to directory 'd'. 'v' must match a directory set up
    using --site.virtual.

Site switches

You will probably need to set some of these switches. For example, if
your website is www.example.com, then you should say so using the
--site.domain switch.

--site.domain domain
-S domain
    The domain name of the site is 'domain'. This can be repeated. This
    is used to identify any URL that is apparently external but is
    actually internal to the site.

--site.extension ext
-x ext
    Treat files with this extension as X/HTML source files. This may be
    repeated. Files with extension .html are always checked.

--site.index file
-i file
    This is the name of the default file in a directory. This can be
    repeated. This is used when checking internal links. The default
    default is index.html.

--site.root dir
-g dir
    This is the root of the website to analyse. ssc will recursively
    scan the directory analysing any HTML files it finds. The default is
    the current directory.

--site.virtual v=d
-L v=d
    The virtual directory 'v' is located in actual directory 'd' on the
    local filesystem. For example:
        --site.virtual virtual=D:\actual

Spell switches

These control spell checking. SSC doesn't actually spell check itself,
it uses spell checking facilities on the host system, so your results
may vary.
--spell.accept XXX
    XXX is a correct spelling of a word (or a list of words) in all
    languages.

--spell.cased
    Nitpick correctly spelt but wrongly cased words.

--spell.check
    Check text spelling. Uses external spelling checkers, so results
    will be inconsistent between systems.

--spell.dict LANG,DICT
    Unix only. Associate dictionary DICT with LANG. For example, if the
    standard English dictionary is en_GB-large:
        --spell.dict en-GB,en_GB-large
    (Under Windows, ssc uses the OS dictionaries.)

--spell.icu
    If "no", do not use the ICU libraries at all (they are rather slow).
    This will increase the inaccuracy and incorrectness of the spell
    checks.

--spell.list FN,LANG
    The file FN contains a list of valid spellings for language LANG
    (which may include country info). If LANG is omitted, the valid
    spellings apply to all languages. For example:
        --spell.list villages.txt,en-IE
        --spell.list dorfer.txt,de
        --spell.list letzstied.txt

--spell.path PATH
    Unix only. Path to spelling executable. Hunspell or a compatible
    program is expected. If none is specified, ssc will seek hunspell.
    Under Windows, ssc uses the system spell-checker, if there is one.

Stats switches

SSC can output lots of statistical information about the site being
analysed, although by default it outputs nothing. Use --stats.selected
to output a small subset of statistical data, and --stats.all to output
everything. Use --stats.summary to output grand totals, and --stats.page
to output information on each page read. The other switches allow you to
precisely specify what data you want to see. If you want to output the
data to a file, use --stats.export. If you select both --stats.page and
--stats.all, be prepared for rather a lot of output.

--stats.abbr
    Output abbreviation report, so you can verify the same abbreviations
    have the same expansions across the site.

--stats.all
    Output all statistics reports.

--stats.annotation
    Output annotation report.
--stats.attribute
    Output element attribute report, which expands the element report to
    output information about attributes used.

--stats.category
    Output category report, which outputs the total quantity of nits
    reported by nit category.

--stats.character-variant
    Output character variant report.

--stats.class
    Output class report, which allows you to see which classes are
    defined in CSS but not used, which classes are used but not defined,
    as well as a count of both for all classes encountered.

--stats.content-name
    Output content name report.

--stats.counter-style
    Output counter style report.

--stats.css-property
    Output css property report, which gives you an idea of the
    sophistication of the CSS used on the site.

--stats.custom-media
    Output custom media report, which lists all named custom media
    definitions encountered.

--stats.custom-property
    Output custom property report, which lists all named custom property
    definitions encountered.

--stats.definition
    Output definitions report, so you can verify the same terms have the
    same definitions across the site.

--stats.element
    Output element report, which totals all elements encountered across
    the site.

--stats.error
    Output counts of errors, warnings, etc.

--stats.export F
    Export to file F.

--stats.file
    Output file report, which reports the number of pages processed, and
    summarises file sizes.

--stats.font
    Output font report, which lists all fonts used across the site.

--stats.font-family
    Output font family report, which lists all font families named
    across the site.

--stats.highlight
    Output highlight report.

--stats.historical-form
    Output historical font form report.

--stats.id
    Output id report, allowing you to identify which ids are styled but
    not mentioned.

--stats.itemid
    Output itemid report, which gives you an idea of the ontological
    significance and depth of the site.

--stats.keyframe
    Output keyframe report, which lists all named keyframes.

--stats.layer
    Output layer report, which lists all named layers.
--stats.meta
    Produce statistics on <META> usage in <HEAD>. Note that pragmas
    reported (http-equiv) are those found in the HTML source, not those
    returned by the HTTP protocol. Remember that many web servers (not
    all) will remove some pragmas when serving pages.

--stats.name-value
    Output name/value pairs report, which helps you identify
    inconsistencies between definitions across the site.

--stats.ontology
    Output ontology report, which gives an insight into the ontological
    depth of the site being analysed.

--stats.ornament
    Output ornament report, which reports all named CSS font ornaments
    encountered.

--stats.page
    Produce statistics for each source file encountered.

--stats.page-name
    Output page name report, which reports all named CSS page-names
    encountered.

--stats.palette
    Output palette report, which reports all named CSS palettes
    encountered.

--stats.property
    Output ontology property count report, as an addendum to
    --stats.ontology.

--stats.reference
    Output reference report, which identifies, as precisely as it can,
    which versions of HTML, XHTML, CSS, etc., are found.

--stats.region
    Output region report, which reports all CSS named regions
    encountered.

--stats.scroll-anim
    Output scroll animation report, which reports all CSS named scroll
    animations encountered.

--stats.selected
    Output a selected set of reports; may be modified by other stats
    switches.

--stats.statement
    Output CSS statement report, which summarises all CSS statements
    encountered.

--stats.styleset
    Output styleset report, which reports all CSS named stylesets
    encountered.

--stats.stylistic
    Output stylistic report, which reports all CSS named stylistics
    encountered, excluding the band themselves.

--stats.summary
    Produce a summary of overall statistics for the website, including
    grand totals.

--stats.swash
    Output swash report, which reports all CSS named swashes
    encountered.

--stats.version
    Output version report, which summarises versions of HTML, SVG,
    MathML, etc., encountered.
--stats.view
    Output view report, which reports all CSS named views encountered.

SVG switch

If you want to analyse some SVG that is not contemporary to the HTML
being analysed, you may find the --svg.version switch useful.

--svg.version x
    Presume any SVG code encountered is this version, unless the SVG
    code itself specifies a version. Versions recognised:

    1.0
    1.1
    1.2        (really 1.2/tiny)
    1.2/tiny
    1.2/full   (May 2004 draft, incomplete, any conflict with 1.2/tiny
               always resolves in favour of 1.2/tiny)
    2.0
    2.1        (April 2021 draft)

    If this switch is not used, and some SVG code does not identify its
    version, the version is derived from the version of the host X/HTML
    code.

Validation switches

These switches are only useful if you have bespoke HTML and CSS on your
website. They allow you to define additional valid values of certain
data types. Start with the --validation switch, and go on from there.

--validation
    Only available from the command line. Lists all types that can be
    given additional valid values.

--validation.attribute ATT
    Add the custom attribute ATT. This attribute will be ignored, not
    validated. ATT may optionally be a series of comma separated values:
        name,namespace,flags,flags2
    The possible values of flags and flags2 can be understood by looking
    at the source.

--validation.charset CH
    Accept CH as a charset. May be repeated.

--validation.class CL
    Add the valid class CL. May be repeated.

--validation.color COL
    Accept COL as a colour. May be repeated.

--validation.colour COL
    Accept COL as a colour. May be repeated.

--validation.country CC
    Accept CC as a valid two-letter country code. May be repeated.

--validation.currency CUR
    Accept CUR as a valid currency. May be repeated.

--validation.element EL
    Accept <EL> as a valid element. This element will be ignored, not
    validated. EL may optionally be a series of comma separated values:
        name,namespace,flags,flags2
    The possible values of flags and flags2 can be understood by looking
    at the source. May be repeated.
--validation.element-attribute EL,ATT
    Accept the known attribute ATT on the element <EL>. Doesn't work
    with namespaces (names containing ':'). May be repeated.

--validation.extension EXT
    Accept the extension EXT as a mimetype file extension. May be
    repeated.

--validation.ff FEATURE
    Accept FEATURE as a CSS font feature. These should normally be four
    characters long. May be repeated.

--validation.ff VARIATION
    Accept VARIATION as a CSS font variation. These should normally be
    four characters long. May be repeated.

--validation.httpequiv HEQ
    Accept HEQ as a valid macro for httpequiv on <META> elements. May be
    repeated.

--validation.lang LANG
    Accept LANG as a valid language code. May be repeated.

--validation.minor x
-m x
    When validating W3 HTML 5 source code, use this minor version of W3
    HTML 5. Valid values are 0, 1, 2, and 3 (draft). WhatWG versions are
    determined by date, corresponding roughly to the date of the
    (online) publication of the specific version. See the --html.version
    switch.

--validation.metaname M
    Accept M as valid for the NAME attribute of the <META> element. The
    VALUE will be ignored. May be repeated.

--validation.microdata
    Validate (schema.org) microdata.

--validation.mimetype MT
    Accept MT as a valid mimetype. May be repeated.

--validation.sgml SGML
    Accept SGML as a valid SGML schema identification (as found in
    <!DOCTYPE ...>). May be repeated.

--validation.XXX YYY
    Accept YYY as a valid value for attribute type XXX. For a list of
    possible values of XXX, use the command line switch --validation.

CONFIGURATION FILE FORMAT

If a configuration file is used, it should be in INI file format. All
content is optional. Section and option names are derived from the long
form switch name, which consists of --SECTION.OPTION, laid out in the
format:

    [SECTION]
    OPTION=
    OPTION=123456

Switches that do not have a long form version cannot be used in a
configuration file. Each ssc test (in the recipe/toast folder) has a
configuration file; browse them for examples.
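As a rough illustration of the --SECTION.OPTION naming rule described under CONFIGURATION FILE FORMAT, the shell sketch below prints the configuration file lines for a single long form switch. The function name switch_to_ini is hypothetical and is not part of ssc; it only restates the rule, and it assumes a switch without a value becomes an option with nothing after the equals sign.

```shell
#!/bin/sh
# Hypothetical helper, not part of ssc: print the INI form of one
# long form switch, following the --SECTION.OPTION naming rule.
switch_to_ini() {
    switch=${1#--}            # drop the leading "--"
    section=${switch%%.*}     # text before the first "."
    option=${switch#*.}       # text after the first "."
    # A missing second argument yields an empty value ("option=").
    printf '[%s]\n%s=%s\n' "$section" "$option" "${2-}"
}

switch_to_ini --site.domain example.edu
switch_to_ini --stats.summary
```

A real configuration file would normally group all the options of one section under a single [SECTION] header rather than repeating the header for each option.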
ENVIRONMENT

If you set --general.cgi, ssc will check these environment variables:

    QUERY_STRING   Run under OpenBSD's httpd server. See notes below.
    SSC_CONFIG     If no configuration file is given on the command
                   line, use this one
    SSC_ARGS       Preliminary command line parameters

If, when SSC is run, the environment variable QUERY_STRING is set to an
OpenBSD httpd server CGI value that includes the parameter html.snippet,
then SSC will nitpick that snippet only. Some other parameters are
processed, including general.verbose and html.version.

EXIT STATUS

If no significant nits are found, ssc exits with 0, otherwise it exits
with a value > 0. See the --general.error switch.

OUTPUT TEMPLATE

The --nit.format switch allows control of output format. It takes a file
name. The format of that text file is a sequence of fixed section names,
enclosed in square brackets on their own lines, each optionally followed
by text. In that text, certain specific identifiers, enclosed in brace
pairs, are substituted. For example:

    [dog-section]
    My dog {{dog-name}} is a {{bad-dog}}.

For examples, browse recipe/toast/output/*.nit

If no file is specified, or if the file cannot be loaded, a default
template is used. Note also the --nit.quote switch.
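The exit status described under EXIT STATUS lends itself to scripting. The sketch below is one possible way to act on it; the function name check_site is hypothetical and not part of ssc. It runs whatever command it is given and reports whether that command exited with 0.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of ssc: run a checker command and
# report on its exit status. "$@" is the full command line to run.
check_site() {
    if "$@"; then
        echo "no significant nits"
    else
        status=$?    # exit status of the command that just failed
        echo "significant nits found (exit status $status)" >&2
        return "$status"
    fi
}
```

For instance, check_site ssc -f config.file could gate a later upload step, so that nothing is published while significant nits remain.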
EXAMPLES

To verify the version of ssc:

    ssc -V

To check the static web site source directory /home/site/wwwroot:

    ssc /home/site/wwwroot

To check a static HTML/XHTML website for example.com, that uses server
side includes, in the current directory, with verification of external
links, with rather verbose output:

    ssc -e -I -x html -x shtml -s example.com -v 5 -i index.shtml

To check a static web site in the current directory, with a virtual
directory, verifying microformats:

    ssc -L virtual=/home/site/virtual -M

To check a static web site using a configuration file:

    ssc -f config.file

A simple configuration file might contain:

    [general]
    verbose=4
    output=simple.out

    [site]
    domain=example.edu
    extension=html
    index=index.html
    root=simple

A configuration file to check a site against HTML 5.2 and SVG 1.1 might
contain:

    [general]
    output=site.out
    class=

    [link]
    check=

    [site]
    domain=example.edu
    extension=html
    index=index.html
    root=site

    [html]
    version=5.2

    [svg]
    version=1.1

A configuration file to check against a particular WhatWG living
standard, gathering statistics:

    [general]
    output=jan21.out

    [html]
    version=2021/01/01

    [link]
    check=

    [microdata]
    version=11.0

    [site]
    domain=example.edu
    extension=html
    index=index.html
    root=site

    [stats]
    summary=
    meta=

A configuration file to shadow copy and deduplicate a site might
contain:

    [general]
    output=dedu.out
    class=

    [site]
    domain=example.edu
    extension=html
    index=index.html
    root=site

    [shadow]
    copy=5
    root=shadow
    file=dedu.ndx

A configuration file to export microdata preparing against schema.org
version 7.2 might contain:

    [general]
    output=export.out
    class=

    [site]
    domain=example.edu
    extension=html
    index=index.html
    root=site

    [link]
    check=

    [microdata]
    export=
    root=export
    version=7.2

Example conf files can be found scattered across the test suite, in
particular in recipe/toast/conf/other and recipe/toast/conf/sites.

PREPARING and UPDATING a SITE

These notes are based on the steps I take to update an OpenBSD website.
Presume a directory containing the following:

    site.conf   ssc configuration file for a website
    site        shadow output produced by ssc

Then I run a script like this:

    ssc -f site.conf
    upload.sh site /var/www/site-upload server user 0
    ssh user@server "cd /var/www ; mv site x ; mv site-upload site ; mv x site-upload ; ln -sf site htdocs"

upload.sh is a macos bash script that can be found among the source
code. Note that I have rather naughtily replaced OpenBSD's httpd
document directory /var/www/htdocs with a link. The conf file can be
found at recipe/toast/conf/sites/live.conf.

SEE ALSO

tidy
linkchecker

HISTORY

ssc (ssc.lu) is written by Dylan Harris (dylanharris.org)
known issues
SSC is α software. It doesn’t do what it’s supposed to do, and what it’s supposed to do is wrong.
- SSC is built based on my understanding of various standards. My understanding is certainly wrong. I will have misread some text, and misunderstood what I read correctly.
- I had to make a number of compromises when building the code. Some checks are incomplete or even entirely missing.
- I put my emphasis on standards that are actively followed, and put little effort, beyond the basics, into those that were never properly implemented.
- I built an evolving product, reflecting evolving standards: biological evolution made the dodo, linguistic evolution made ‘hippopotamus’*, I made ssc.
- The code was built to get something working. In many places, it is horrible. A great deal of refactoring could take place, had I the time.
- No attempt was made to write secure code. It should only be run on trusted data. In particular, a great weakness of much software is the parser, and the parsers in SSC were handmade using hopeless optimism and bizarre ideas.
- The tests are incomplete. Emphasis is placed on HTML 5, but even those tests suffer from missing content.
Note that github hosts a list of known issues.
* How can such a dangerous animal have such a cuddly name? It’s like calling a Hound of Hell ‘Fluffy’, or Death’s horse Binky.
bug reporting
SSC is α software. It may contain unexpected features. If you encounter such a delight, please help improve ssc by collecting the following information (where relevant):
- version of ssc;
- precise version of the operating system;
- hardware architecture and system information;
- detailed description of the error and how to cause it;
- copy of output file showing the error;
- copy of pages/website being analysed;
- precise command used;
- configuration file(s) used, if any;
- any ndx file or other pre-existing file used during the run;
- any known workarounds or solutions;
- a blues or dance interpretation of the 'feature'.
and emailing everything to mail@ssc.lu (if the collected files are more than small, please use a public fileserver and email the link). Do NOT send anything confidential. Furthermore, unless you request otherwise, we reserve the right to publish some or all of the information sent in future versions of ssc, usually in the test suite. If you have a fix, you are invited to submit a pull request on github. Thank you.
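As a starting point for gathering the first few items on the list above on a Unix-like system, here is a small shell sketch. The function name collect_report, the SSC variable, and the default file name ssc-report.txt are all hypothetical; the sketch relies only on ssc -V (which reports the ssc version) and the standard uname command. The output, configuration and ndx files still need to be attached by hand.

```shell
#!/bin/sh
# Hypothetical helper, not part of ssc: write the ssc version and basic
# system details to a text file. Set SSC to the ssc executable if it is
# not on the PATH.
collect_report() {
    report=${1:-ssc-report.txt}
    {
        echo "== ssc version =="
        "${SSC:-ssc}" -V 2>&1 || echo "(could not run ssc)"
        echo "== operating system and architecture =="
        uname -a
    } > "$report"
    echo "wrote $report"
}
```

Calling collect_report with no argument writes to ssc-report.txt in the current directory. Check the file for anything confidential before sending it.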
build
BUILD NOTES

static site checker
https://ssc.lu/
(c) 2020-2024 Dylan Harris

Introduction
============

SSC can be built on various unii with CMake and clang or gcc for C++ 17
or better, or Visual Studios 2017 / 2019 / 2022 under Windows. I have
built & tested it in various OSs on some amd64 & arm64 architectures.
Although ssc builds with older compilers on some older systems, not all
features are available.

Libraries
=========

Common dependencies
-------------------

ssc needs boost version 1.75 or better (https://boost.org), a recent
copy of the ICU libraries (https://icu-project.org/) (or define NOICU),
Microsoft's GSL (https://github.com/Microsoft/GSL) (or define NO_GSL),
and a recent version of libcurl (https://curl.se/)* (or define NOCURL).
If you want to experiment with the still-in-development GUI version,
you'll also need a recent version of wxWidgets. Usually, an Operating
System's package manager has appropriate versions ready to install.

You may need to set these environment variables:
- BOOST: if you're not using your operating system's packaged flavour of
  boost, then set BOOST to your boost source root directory;
- CURL: if you're not using your operating system's packaged flavour of
  curl, then set CURL to your curl source root directory;
- GSL: set it to your GSL root directory;
- ICU_ROOT: if you're not using your operating system's packaged ICU,
  set ICU_ROOT to your ICU source root directory;
- WX_ROOT: if you're building the gui front end to ssc, you'll need to
  install wxWidgets and set WX_ROOT to its installation directory.

*libcurl requires a thread-safe underlying SSL library: see
https://curl.se/libcurl/c/threadsafe.html.

Note that the Windows solutions no longer require vcpkg; it proved too
unreliable. However, if you wish to use it, go ahead: you may have
better luck than me.

hunspell
--------

Building SSC under unii, including macos, requires a development
installation of hunspell (https://hunspell.github.io/).
winspell
--------

The Windows build, by default, uses the native Windows spellchecker,
although, preceding Windows 11, that doesn't seem to work so well in
contexts unimpaired by monolingualism.

Notes on the GUI
================

wxWidgets
---------

Why use this ancient behemoth given the good number of somewhat less
archaic C++ GUI libraries? The requirements were: (i) Open Source;
(ii) supports Windows/MacOS/Linux/OpenBSD. Of those libraries I found,
only wxWidgets was documented to support OpenBSD.

Polylingualism
--------------

ssc is written for coders. HTML/etc. code is based on English, so ssc's
GUI text is similarly monolingual.

Unstable
--------

The GUI will evolve rapidly over the coming few months, so expect it to
change significantly.

Building
========

Windows
-------

To build from Visual Studio, navigate to recipe/tea, open the
appropriate .sln file, then build. Only Visual Studios 2017, 2019 and
2022 have been built & tested, for amd64 (x64) and arm64 (M2), under
Windows 10 & 11. On low memory machines, disable the /MP switch.

The Visual Studio solutions use vcpkg, which resolves dependencies. For
all versions, except recent editions of Visual Studio 2022, you may need
to first download and install vcpkg yourself, from https://vcpkg.io/.

Unii & mock Unii
----------------

You will need CMake 3.19 or better. On Linux, you will also need
lsb-release. These can be found in most distributions' standard
packages. For macos, I used macports, but I expect brew is good too.

From the home ssc directory, compile a normal build thus:

    cmake .
    make
    ctest
    make install

For a debug build:

    cmake -DCMAKE_BUILD_TYPE=Debug .
    make
    ctest
    make install

If everything works correctly, then everything will be built, a series
of tests run, with a final result at the very end saying no failures.
Having said that, given SSC is alpha, don't be too surprised to see some
warnings or some final test errors.
Note in particular that complaints about being unable to find or copy
files during testing are not of concern, these come from scripts that
set up or tear down individual tests, and the standard commands used
sometimes complain if they can't find files they're supposed to delete,
which is a bit silly given that means things are already in the required
state.

ssc has been successfully built in OpenBSD, FreeBSD, Linux & MacOS on
AMD64, and in recent versions of Linux & MacOS under ARM64. The current
version of ssc requires the current version of an operating system.
Older operating systems require older versions of ssc. Not all features
work on all systems.

I've sometimes found it necessary to use cmake's
-DCMAKE_CXX_COMPILER=... switch.

Centos 9
--------

The appropriate CMake command is:

    cmake . -DFLAVOUR=CentosOSStream -DFLAVOUR_VER=9

(note the standard English spelling of flavour.)

OpenBSD
-------

You may need to increase significantly the available memory setting for
your build account in login.conf.

Macos
-----

Certain versions of macos clang produce buggy code, whether or not
optimisations are applied. Use an alternative compiler if you want a
stable executable. I accept a bug could be in ssc code, but I've not
found it.

Testing
=======

Windows
-------

Under Visual Studio, run ssc??-test using these arguments:

    -v -x $(ProjectDir)..\..\ssc.exe -f $(ProjectDir)..\toast\ssc-test\win.lst

(on one line)

Add '-d' if you want the test utility to retain temporary files.

CMake
-----

Under CMake, run ctest:

    ctest -V

(which runs ssc-test for you, using nix.lst).

Dimitude
--------

The testing utility is rather dim; it will test unbuilt features,
causing failures. Spelling test results depend on the dictionaries
installed.
Supporting libraries
====================

GSL
---

If you can't find a copy of Microsoft's GSL in your system's standard
package suite, then grab a current copy from its github repository
(https://github.com/Microsoft/GSL), then unpack, build and install it.
In Windows, remember to add its root directory to your local path.

Boost
-----

Boost is to C++ as breakfast to the working day. Most package managers
support it, including vcpkg. Alternatively, build your own version using
the source found at boost.org.

Curl
----

curl is used for link checking and, where necessary, obtaining remote
resources. Most package managers support it.

wxWidgets
---------

This is only required if you make a GUI build, which is not recommended
(yet). It can be found at wxwidgets.org. Note that I have not tested it
under many supported (by wxWidgets) systems.

Editions
========

Currently, there are new, work-in-progress, server and gui editions in
the Visual Studio solution. These do not work and should not be used.
Stick to the standard edition.
notes
If everything works correctly, then everything will be built, a series of tests run, with a final result at the very end saying no failures. Having said that, given SSC is α, don’t be too surprised to see some warnings or some final test errors. Worse, some tests have dependencies that vary across systems, which can cause spurious test failures.
source
0.2.4
- OpenBSD 7.6
- remove Windows build dependency on vcpkg
- added noopener-allow-popups
- offer a simple shell
- various bug fixes
- underlying work
- download source
- released
0.2.3
- Living Standard October 2024
- schema.org version 28
- DPV ontologies to v2.0
- macos sequoia
- various bug fixes
- underlying work
- download source
- released
0.2.2
- schema.org version 27.0.2
- most DPV ontologies to v0.7
- CSS Viewport Module Level 1, January 2024 draft
- note edition in version info
- various bug fixes
- underlying work
- download source
- released
0.2.1
- Living Standard July 2024
- CSS 2024 Snapshot
- schema.org version 27.01
- “4. Safe to Release pre–CR Exceptions” (CSS Snapshots, 2018 onwards)
- various bug fixes
- underlying work
- download source
- released
0.2.0
- Living Standard April 2024
- schema.org 27.0
- OpenBSD 7.5
- ICANN .internal recommendation
- various bug fixes
- underlying work
- download source
- released
0.1.60
- add rel="expect"
- underlying work
- download source
- released
0.1.59
- Withdraw support for Macos High Sierra and earlier
- Withdraw support for OpenBSD 6.8
- various CMake improvements
- underlying work
- download source
- released
0.1.58
- recognise ICANN’s .internal
- various CMake improvements
- underlying work
- download source
- released
0.1.57
- add CMake uninstall target
- download source
- released
0.1.56
- fix NO_GSL build
- various bug fixes
- underlying work
- download source
- released
0.1.55
- schema.org 26.0
- various bug fixes
- underlying work
- download source
- released
0.1.54
- prepare macports submission
- submit openbsd port (for 0.1.53)
- various bug fixes
- underlying work
- download source
- released
0.1.53
- schema.org 25.0
- Visual Studios 2017/2019 solutions now use vcpkg
- various bug fixes
- underlying work
- download source
- released
0.1.52
- Fix Macos build bug introduced in 0.1.51
- more dynamic loads
- underlying work
- download source
- released
0.1.51
- USE 0.1.52; this has a macos build error
- most libraries now dynamic load
- updated CSS 2023 Snapshot to December 2023 specification
- underlying work
- download source
- released
0.1.50
- Schema.org 24.0
- Living Standard Jan 2024
- VS 2022 now uses vcpkg
- recategorised some switches
- default thread count now 1
- address more Apple clang bugs
- underlying work
- download source
- released
0.1.49
- updated stats reports
- microdata switch section renamed to ontology
- various bug fixes
- underlying work
- download source
- released
0.1.48
- Freebsd 12.4 and 13.2
- renamed --general.slob as --general.sloven
- various bug fixes
- underlying work
- download source
- released
0.1.47
- amended --general.vcs filename list
- various bug fixes
- underlying work
- download source
- released
0.1.46
- NOTE 0.1.46 has a type bug; prefer 0.1.47
- CSS Box Sizing 4
- CSS Text Overflow 4
- CSS Spatial Navigation 3
- added -A switch
- --general.exclude now uses simple matching
- --link.pretend now uses simple matching
- generalised --general.git as --general.vcs
- various bug fixes
- underlying work
- download source
- released
0.1.45
- CSS Marquee
- restore VS 2017 build
- OpenBSD package ready
- various bug fixes
- underlying work
- download source
- released
0.1.44
- CSS Generated Content 3
- CSS Region 3
- CSS Scoping 3
- CSS Scroll-Driven Animations 3
- CSS View Transitions 3
- improved integration of SVG and CSS
- various bug fixes
- underlying work
- download source
- released
0.1.43
- OpenBSD 7.4
- Macos Sonoma
- CSS Custom Highlights 3
- CSS Non-Element Selectors 3
- CSS Paged Media 3
- CSS Shadow Parts 3
- various bug fixes
- underlying work
- download source
- released
0.1.42
- schema.org 23.0
- CSS Overscroll 3
- CSS Page Floats 3
- CSS Presentation Levels 3
- CSS Pseudo-Elements 4
- CSS Rhythmic Sizing 3
- CSS Round Display 3
- CSS Ruby Annotations 3
- CSS Scrollbar Anchor 3
- various bug fixes
- underlying work
- download source
- released
0.1.41
- all CSS snapshots
- CSS Conditional Rules 3
- CSS Conditional Rules 4
- CSS Conditional Rules 5
- CSS Lists 3
- CSS Text 3
- CSS Text 4
- various bug fixes
- underlying work
- download source
- released
0.1.40
- CSS Colour Adjustment 3
- CSS Device Adaptation 3
- CSS Exclusions 3
- CSS Inline Layout 3
- CSS Line Grid 3
- CSS Logical Properties 3
- CSS Scrollbar Style 3
- underlying work
- download source
- released
0.1.39
- CSS Contain 5
- CSS Contain 4
- CSS Contain 3
- CSS Filter 3
- CSS Scroll Snap 3
- various bug fixes
- underlying work
- download source
- released
0.1.38
- CSS 2015 Snapshot
- CSS Mobile Profile 2
- CSS Print Profile
- CSS TV Profile
- CSS Will Change 3
- drop VS 2017
- underlying work
- download source
- released
0.1.37
- CSS Images 4
- CSS Masking 3
- CSS Transforms 3
- CSS Transforms 4
- various bug fixes
- underlying work
- download source
- released
0.1.36
- Living Standard October 2023
- CSS Images 3
- --html.ie ignores certain old internet explorer naughtitudes
- --html.safari ignores certain old safari naughtitudes
- underlying work
- download source
- released
0.1.35
- CSS Speech 3
- CSS Text Decoration 3
- CSS Text Decoration 4
- underlying work
- download source
- released
0.1.34
- Fabio 2.1 ontology
- Prism ontologies, partial
- Adobe ontologies, partial
- Exif ontologies, partial
- various bug fixes
- underlying work
- download source
- released
0.1.33
- ADMS 1.0 and 2.0 ontologies
- BFO 2.0 and 2020 ontologies
- disco (ddi ontology)
- various bug fixes
- underlying work
- download source
- released
0.1.32
- CSS Grid 3 (partial)
- CSS Grid 4 (partial)
- CSS Shapes 3
- various bug fixes
- underlying work
- download source
- released
0.1.31
- CSS Writing Mode 3
- CSS Writing Mode 4
- biro ontology
- cito ontology
- various bug fixes
- underlying work
- download source
- released
0.1.30
- CSS Box Alignment 3
- CSS Box Model 3
- CSS Box Model 4
- CSS Display 3
- CSS Multi-Column 3
- CSS Overflow 3
- CSS Positions 3
- CSS Transitions 3
- X-Clacks-Overhead
- various bug fixes
- underlying work
- download source
- released
0.1.29
- CSS Counter Style 3
- CSS Flexible Box Layout 3
- underlying work
- download source
- released
0.1.28
- CSS Font 3
- CSS Font 4 (December 2021 draft)
- CSS Font 5 (December 2021 draft)
- underlying work
- download source
- released
0.1.27
- CSS Compositing
- CSS Fragmentation 3
- CSS Fragmentation 4 (December 2018 draft)
- schema.org 22
- various bug fixes
- underlying work
- download source
- released
0.1.26
- CSS Backgrounds and Borders (Feb 2023 version)
- CSS Colour 5
- CSS Easing Functions
- further improve build time
- various bug fixes
- underlying work
- download source
- released
0.1.25
- Living Standard July 2023
- output nit references, where applicable
- various bug fixes
- underlying work
- download source
- released
0.1.24
- improved build time
- underlying work
- download source
- released
0.1.23
- CSS Animation 3 (called 1, but not part of CSS 1 spec)
- CSS Animation 4 (called 2, but not part of any CSS 2 spec)
- CSS Values 4
- various bug fixes
- underlying work
- download source
- released
0.1.22
- linux ARM64 (centos 9 tested)
- schema.org default now version 21
- CSS Values 3
- underlying work
- download source
- released
0.1.21
- schema.org updated (default still 15.0)
- ARM64 builds for macos and windows
- underlying work
- download source
- released
0.1.20
- OpenBSD 7.3
- underlying work
- download source
- released
0.1.19
- CSS 3 Cascade
- CSS 4 Cascade
- CSS 5 Cascade
- CSS 6 Cascade (March 2023 draft)
- CSS 4 Colour
- CSS Custom Properties
- CSS 4 Selectors (November 2022 draft)
- CSS 3 Syntax
- Abhorrent nits for markup that disgusts
- underlying work
- download source
- released
0.1.18
- WhatWG Living Standard (April 2023)
- CSS 5 Media
- CSS 4 Media
- underlying work
- download source
- released
0.1.17
- CSS 3 Basic User Interface
- CSS 4 Basic User Interface (March 2021 draft)
- CSS 3 Colour
- CSS 3 Media
- underlying work
- download source
- released
0.1.16
- CSS STYLE attribute
- CSS 3 accessibility
- underlying work
- download source
- released
0.1.15
- CSS 3 Namespaces
- CSS 3 Selectors
- underlying work
- download source
- released
0.1.14
- underlying work
- download source
- released
0.1.13
- Living Standard for January 2023
- CSS 2.0 verification
- CSS 2.1 verification
- CSS 2.2 (February 2022 draft) verification
- underlying work
- download source
- released
0.1.12
- NOTE: many tests fail in this version; they need CSS 3 which isn’t there yet
- CSS 1 verification
- expanded stats for CSS
- output id-* renamed as itemid-*, class-* generalised to tally-*
- macOS Ventura
- OpenBSD 7.2
- underlying work
- download source
- released
0.1.11
- added --link.pretend REGEX which pretends the link regex exists locally
- tighten poetry ontology
- various bug fixes
- download source
- released
0.1.10
- schema.org 15.0 (beta)
- download source
- released
0.1.9
- Living Standard for October 2022
- poetry ontology 1.1
- various reliability improvements
- download source
- released
0.1.8
- unknown class names are retained when shadowing
- various reliability improvements
- download source
- released
0.1.7
- improve memory footprint
- underlying work
- download source
- released
0.1.6
- partial MathML 4 core (21 July 2022 draft)
- backport to Yosemite
- underlying work
- download source
- released
0.1.5
- restore Visual Studio 2017 solution
- backport to older versions of macos
- underlying work
- download source
- released
0.1.4
- bug fix release
- download source
- released
0.1.3
- dropped date library & boost process (both use system ())
- replaced curl executable with curl library
- drop visual studio 2017 solution
- more linux flavours tested
- builds under freebsd 12.3 & 13.1
- download source
- released
0.1.2
- trial OpenBSD port
- Linux builds require lsb_release
- underlying work
- download source
- released
0.1.1
- avoid clang clanger
- underlying work
- download source
- released
0.1.0
- the very first α release!
- underlying work
- download source
- released
0.0.134
- further performance improvements
- underlying work
- download source
- released
0.0.133
- further performance improvements
- underlying work
- download source
- released
0.0.132
- further performance improvements
- underlying work
- download source
- released
0.0.131
- living standard for July 2022
- underlying work
- download source
- released
0.0.130
- performance improvement with multithreading
- can exclude files from processing, such as git files
- can list all classes used, not just those defined in CSS files
- uses Howard Hinnant’s date library
- fix schema hierarchy property type checks
- underlying work
- download source
- released
0.0.129
- improve diagnostics of invalid HTTP-EQUIV entries
- underlying work
- download source
- released
0.0.128
- Moved CMakeLists.txt to project root to appease certain packagers
- process schema.org versions before v2.0
- fixed some ontology verification bugs
- underlying work
- download source
- released
0.0.127
- all versions of the gs1 ontology now recognised
- various refinements & bug fixes
- download source
- released
0.0.126
- Living Standard April 2022
- various refinements & bug fixes
- download source
- released
0.0.125
- schema.org 14.0
- mark schema.org attic content as deprecated
- gs1 microdata now recognised
- make default protocol https (it was http)
- removed webmention code
- various refinements & bug fixes
- download source
- released
0.0.124
- builds under CentOS Stream 9
- report unexpected content in configuration file
- use a binary switch to select option, & a new ‘no’ switch to deselect it
- documented certain previously hidden switches
- various refinements & bug fixes
- download source
- released
0.0.123
- Improved spell checks with ICU libraries (use --spell.icu to disable them)
- added --link.example, --link.local and --link.report
- documented certain previously hidden switches
- various refinements & bug fixes
- download source
- released
0.0.122
- Added spelling checks & spell.xxx switches (requires hunspell on unix)
- Changed behaviour of binary switches (args are now processed, not presumed)
- A number of features are enabled by default
- underlying work / various refinements
- download source
- released
0.0.121
- Living Standard Jan 2022 (very similar to October 2021)
- Drop 32 bit builds & macos before catalina
- underlying work
- download source
- released
0.0.120
- RDFa
- added --ontology.list to list known ontology schema
- added --ontology.ONT x.y to set the default version of ontology ONT
- download source
- released
0.0.119
- macos Monterey
- underlying work / various refinements
- download source
- released
0.0.118
- Visual Studio 2022 solution
- underlying work / various refinements
- download source
- released
0.0.117
- underlying work / various refinements
- download source
- released
0.0.116
- change default HTML to living standard Oct 2021
- OpenBSD 6.9 / 7.0
- underlying work / various refinements
- download source
- released
0.0.115
- set unii installation directory to ~/bin
- added experimental solution for Visual Studio 2022 preview
- XHTML role attribute (https://www.w3.org/TR/xhtml-role/)
- now requires boost 1.75 or better
- unii builds now require CMake 3.12 or better
- underlying work
- download source
- released
0.0.114
- underlying work
- download source
- released
0.0.113
- RDFa with schema.org, but otherwise no core initial context (yet)
- restore progress report via -D switch
- underlying work
- download source
- released
0.0.112
- control output format
- underlying work
- download source
- released
0.0.111
- living standard july 2021
- schema.org v 13.0
- added --shadow.enable
- drop Visual Studio 2015
- download source
- released
0.0.110
- underlying work
- download source
- released
0.0.109
- update flag, so SSC only looks at files that have changed recently
- download source
- released
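One way an update flag like the one in 0.0.109 could pick out recently changed files is by comparing modification times against a cutoff. The following is a minimal Python sketch of that idea, not SSC's own code: the function name `changed_since`, the use of file modification times, and the `site` directory in the usage lines are all assumptions made for illustration.

```python
import time
from pathlib import Path


def changed_since(root: Path, cutoff: float) -> list[Path]:
    """Return the files under root modified after cutoff (seconds since the epoch)."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_mtime > cutoff
    )


if __name__ == "__main__":
    # Hypothetical usage: list files in ./site changed in the last 24 hours.
    one_day_ago = time.time() - 24 * 60 * 60
    for path in changed_since(Path("site"), one_day_ago):
        print(path)
```

A checker working this way would only need to parse the returned files, though it would still need the full site index to verify links between pages.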
0.0.108
- default version of HTML 5 switched to W3’s HTML 5.2.
- added example website update script
- specify which page content goes in the corpus
- download source
- released
0.0.107
- SVG 1.2/Tiny
- partial SVG 1.2/Full (May 2004 draft):
  - conflicts with 1.2/Tiny always resolved in favour of 1.2/Tiny
  - extensions parsed but not processed
  - not complete, nor will it ever be
- SVG 2.0 (August 2018) with:
  - December 2018 Filter Effects
  - April 2021 Animations draft
- SVG 2.0 (April 2021 draft) (2.1 to be?) with:
  - October 2019 Filter Effects
  - April 2021 Animations draft
- download source
- released
0.0.106
- underlying work
- download source
- released
0.0.105
- improved diagnostics on abort
- underlying work
- download source
- released
0.0.104
- underlying work
- download source
- released
0.0.103
- underlying work
- download source
- released
0.0.102
- underlying work
- download source
- released
0.0.101
- MathML 4, Dec 2020 draft (it’s early days; MathML 4 is really MathML 3 with post–it notes)
- can run in the OpenBSD 6.8 httpd server CGI environment (do NOT expose SSC to untrusted data sources, such as those on the open web, without taking serious precautions: SSC is α software, and probably has more bugs than the Creator’s Ultimate All–Beetle Extravaganza)
- --shadow.changed: only update files in the shadow directory when the originals have changed
- expanded ligature suggestions now work across systems
- recognise the non–standard character codes &bang; &hash; &splat; &squiggle; (! # * ~)
- improvements to corpus data extraction
- download source
- released
0.0.100
- can nitpick against WhatWG Living Standard April 2021 (except MathML 4 & SVG 2)
- expanded character code suggestions, particularly for ligatures (Windows only)
- improved aria attribute verification
- the environment variable SSC_CONFIG can specify a configuration file
- the environment variable SSC_ARGS can specify command line arguments
- specify custom elements and custom attributes (see recipe/toast/type/custom/* for example)
- dump site corpus with -d switch
- download source
- released
0.0.99
- checks microformats in microdata
- export living standard & microformats microdata
- download source
- released
0.0.98
- living standard microdata ITEMTYPEs processed
- stats now only counts reported errors
- download source
- released
0.0.97
- living standard jan 2005 – jan 2021, mostly
- MathML 4 and SVG 2 are not currently understood
- various microdata, including vcard, vevent, purl.org and n.whatwg.org, are not currently understood
- no spellchecker
- download source
- released
0.0.96
- underlying work
- download source
- released
0.0.95
- underlying work
- download source
- released
0.0.94
- <INPUT> PATTERN checks
- improved diagnosis output
- download source
- released
0.0.93
- processes schema.org 12.0 microdata
- download source
- released
0.0.92
- recognise open graph meta names
- expand mime type checking
- download source
- released
0.0.91
- underlying work
- download source
- released
0.0.90
- underlying work
- download source
- released
0.0.89
- more media type / file extension checks
- underlying work
- download source
- released
0.0.88
- added media type checks
- underlying work
- download source
- released
0.0.87
- underlying work
- download source
- released
0.0.86
- underlying work
- download source
- released
0.0.85
- additional stats options, reporting <DT><DD>, <ABBR>, & <DFN> content
- download source
- released
0.0.84
- underlying work
- download source
- released
0.0.83
- verifies various new living standard referenced http-equiv pragmas
- download source
- released
0.0.82
- --schema.version now accepts + for HTML+
- download source
- released
0.0.81
- --schema.version now accepts x.y style versions
- --schema.minor removed
- download source
- released
0.0.80
- adds a (prototype) man page (recipe/tea/gen.txt)
- adds --stats.meta to generate stats on <META> usage in <HEAD>
- checks content-security-policy values
- download source
- released
0.0.79
- A new -z switch to specify the maximum preferred length of title text
- download source
- released
0.0.78
- underlying work
- download source
- released
0.0.77
- checks that the HTML page and the charset declared on it (if any) have something in common
- download source
- released
0.0.76
- added --shadow.ignore to ignore files with specified extension
- download source
- released
0.0.75
- added --microdata.root and --microdata.virtual for microdata exports
- Ubuntu Server 20.10 amd64 build
- default dedu cache now based on config file name
- underlying work
- download source
- released
0.0.74
- can process schema.org 11.0 microdata
- includes some microdata refinements
- download source
- released
0.0.73
- Export ‘repaired’ HTML files, including processing of Server Side Include directives.
- Deduplicate non–HTML files. When used with export, it copies one version of the file and modifies links appropriately.
- download source
- released
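The deduplication described for 0.0.73 might work along the following lines. This is a minimal Python sketch, not SSC's own code: the function name `find_duplicates`, the SHA-256 content hash, and the set of extensions treated as HTML are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Assumed set of extensions to treat as HTML and therefore skip.
HTML_SUFFIXES = {".html", ".htm", ".xhtml"}


def find_duplicates(root: Path) -> dict[Path, Path]:
    """Map each duplicate non-HTML file to the first file seen with the same content."""
    first_seen: dict[str, Path] = {}
    redirect: dict[Path, Path] = {}
    # Sort so that the file kept for each content hash is deterministic.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        if path.suffix.lower() in HTML_SUFFIXES:
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in first_seen:
            redirect[path] = first_seen[digest]
        else:
            first_seen[digest] = path
    return redirect
```

An export step could then copy only the files that appear as values in the mapping, and rewrite any link to a key so that it points at the corresponding value.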
0.0.71
- download source
- released
0.0.70
- download source
- released
0.0.60
- download source
- released
0.0.55
- download source
- released
0.0.2
- download source
- released
boot notes
Notes on folder names:
- recipe: a nod to Vernor Vinge’s “A Fire Upon the Deep”
- tea: without tea, nothing works; then there’s builders’ tea
- sauce: makes something dull quite delicious; identifies the arrogant; &, anyway, it’s obvious
- toast: toasts code; i like burnt toast
- heater: i’m not stopping now
These reference documents are hoovered from various open source sites. They’re collected here for convenience; where a copy differs from its original, the original is correct. The subjects are: aria, activity streams, bibo, creative commons, charsets, content (RDF), content security policy, cascading style sheets, csvw, common tag, dataset quality, dbpedia, dublin core, data catalogue, did, document object model, domain, data quality, data usage, earl, ebu, fibo, foaf, good relations, grddl, HTML 1, HTML 2, HTML 3, HTML 4, HTTP, ical, its, javascript, json, lang, link relations, locn, ma-ont, marinetlo, MathML, media capture, microdata, mime, music, ns, web annotation, odrl, open graph, ontologies, openmath, org, other, owl, p3p, powder, prov, pso, qb, rddl, RDFa, rif, schema.org, sd, sioc, skos, sm, smil, smpte, sosa, ssn, svg, time, ttml, url, vann, vcard, void, W3, webgl, webmention, whatwg, XHTML, xhv, XML, xsd, xsl, XSLT.
copyright & licence
Any dispute shall be resolved in accordance with the law of the Grand Duchy of Luxembourg.
SSC
SSC, static site checker, https://ssc.lu/
copyright (c) 2020-2024 dylan harris
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
W3
Some test files come from w3.org (some directly, in W3 documents, etc.), and are licensed as follows:
License
By obtaining and/or copying this work, you (the licensee) agree that you have read, understood, and will comply with the following terms and conditions.
Permission to copy, modify, and distribute this work, with or without modification, for any purpose and without fee or royalty is hereby granted, provided that you include the following on ALL copies of the work or portions thereof, including modifications:
- The full text of this NOTICE in a location viewable to users of the redistributed or derivative work.
- Any pre-existing intellectual property disclaimers, notices, or terms and conditions. If none exist, the W3C Software and Document Short Notice should be included.
- Notice of any changes or modifications, through a copyright statement on the new code or document such as "This software or document includes material copied from or derived from [title and URI of the W3C document]. Copyright © [YEAR] W3C® (MIT, ERCIM, Keio, Beihang)."
Disclaimers
THIS WORK IS PROVIDED "AS IS," AND COPYRIGHT HOLDERS MAKE NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE OR DOCUMENT WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS.
COPYRIGHT HOLDERS WILL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF ANY USE OF THE SOFTWARE OR DOCUMENT.
The name and trademarks of copyright holders may NOT be used in advertising or publicity pertaining to the work without specific, written prior permission. Title to copyright in this work will at all times remain with copyright holders.
Notes
This version: http://www.w3.org/Consortium/Legal/2015/copyright-software-and-document
Previous version: http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231
This version makes clear that the license is applicable to both software and text, by changing the name and substituting "work" for instances of "software and its documentation." It moves "notice of changes or modifications to the files" to the copyright notice, to make clear that the license is compatible with other liberal licenses.
WhatWG
Some test files come from whatwg.org (some directly, in WhatWG documents, etc.), and are licensed under a Creative Commons Attribution 4.0 International License. See https://whatwg.org/ for details.
corruptpress.com
Some test files are derived from pages at corruptpress.com. They are licensed under a Creative Commons Attribution 4.0 International License. Browse https://corruptpress.com/ for details.
dylanharris.org
Some test files are derived from pages at https://dylanharris.org/. They are licensed under a Creative Commons Attribution 4.0 International License. Browse https://dylanharris.org/ for details.