NAME

checklink - check the validity of links in an HTML or XHTML document


SYNOPSIS

checklink [ options ] URI ...


DESCRIPTION

This manual page briefly documents the checklink command.

checklink is a program that reads an HTML or XHTML document, extracts a list of anchors and links, and checks that no anchor is defined twice and that all the links are dereferenceable, including the fragments. It warns about HTTP redirects, including directory redirects, and can recursively check part of a web site.
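The extraction and duplicate-anchor check described above can be illustrated with a minimal Python sketch. It is not checklink's implementation (checklink is written in Perl); the names AnchorLinkCollector, scan and duplicate_anchors are hypothetical, and the sketch assumes that anchors come from id attributes and from the name attribute of a elements, and that links come from href and src attributes.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorLinkCollector(HTMLParser):
    """Collect anchor names and link targets from an HTML document."""

    def __init__(self):
        super().__init__()
        self.anchors = []  # one entry per anchor name defined on an element
        self.links = []    # values of href and src attributes, in order

    def handle_starttag(self, tag, attrs):
        names = set()  # avoids counting <a id="x" name="x"> twice
        for name, value in attrs:
            if value is None:
                continue
            if name == "id" or (tag == "a" and name == "name"):
                names.add(value)
            elif name in ("href", "src"):
                self.links.append(value)
        self.anchors.extend(sorted(names))


def scan(html_text):
    """Return (anchors, links) found in html_text."""
    collector = AnchorLinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.anchors, collector.links


def duplicate_anchors(anchors):
    """Return the anchor names defined more than once, sorted."""
    counts = Counter(anchors)
    return sorted(name for name, n in counts.items() if n > 1)
```

Checking that each link is dereferenceable would additionally require fetching every target, which this sketch leaves out.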

The program can be used either as a command-line tool or as a CGI script.


OPTIONS

This program follows the usual GNU command line syntax, with long options starting with two dashes (`--'). A summary of options is included below.

-?, --help
Show summary of options.

-V, --version
Output version information.

-s, --summary
Result summary only.

-b, --broken
Show only the broken links, not the redirects.

-e, --directory
Hide directory redirects - e.g. http://www.w3.org/TR -> http://www.w3.org/TR/.

-r, --recursive
Check the documents linked from the first one.

-D, --depth n
Check the documents linked from the first one to depth n (implies --recursive).

-l, --location uri
Scope of the documents checked in recursive mode. By default, the scope is the directory containing the first document; for http://www.w3.org/TR/html4/Overview.html, for example, it would be http://www.w3.org/TR/html4/.

-n, --noacclanguage
Do not send an Accept-Language header.

-L, --languages
Languages accepted (default: '*').

-q, --quiet
No output if no errors are found.

-v, --verbose
Verbose mode.

-i, --indicator
Show progress while parsing.

-u, --user username
Specify a username for authentication.

-p, --password password
Specify a password.

--hide-same-realm
Hide 401 responses that are in the same realm as the document checked.

-t, --timeout value
Timeout for the HTTP requests.

-d, --domain domain
Regular expression describing the domain to which the authentication information will be sent. The default value can be specified in the checklink configuration file.

--masquerade local remote
Masquerade local dir as a remote URI (e.g. /home/hugo/MathML2/ is in fact http://www.w3.org/TR/MathML2/).

-y, --proxy proxy
Specify an HTTP proxy server.

-h, --html
HTML output.
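
The default scope used by --location and the directory redirects hidden by --directory can be illustrated with a short Python sketch. This is only an approximation under stated assumptions, not checklink's own code: the function names are hypothetical, the scope test is assumed to be a simple prefix match, and a directory redirect is assumed to mean a target that equals the source plus a trailing slash.

```python
from urllib.parse import urlsplit, urlunsplit


def default_scope(uri):
    """Drop everything after the last '/' of the path, plus query and fragment."""
    parts = urlsplit(uri)
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return urlunsplit((parts.scheme, parts.netloc, directory, "", ""))


def in_scope(uri, scope):
    """Assumption: a document is in scope when its URI starts with the scope."""
    return uri.startswith(scope)


def is_directory_redirect(source, target):
    """Assumption: the redirect only appends a trailing slash to the source."""
    return not source.endswith("/") and target == source + "/"
```

With the URIs used in the option descriptions, default_scope("http://www.w3.org/TR/html4/Overview.html") gives http://www.w3.org/TR/html4/, and the redirect from http://www.w3.org/TR to http://www.w3.org/TR/ is classified as a directory redirect.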


FILES

/etc/w3c/checklink.conf
The main configuration file. You can use the W3C_CHECKLINK_CFG environment variable to override the default location.

Trusted specifies a regular expression for matching trusted domains (i.e. domains to which HTTP basic authentication credentials, if any, will be sent). For example, the following configures only the w3.org domain as trusted:

    Trusted = \.w3\.org$
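
To see which host names such an expression accepts, the following Python sketch applies the same pattern. It assumes that the expression is matched against the host name of each URI, which this page does not state, and it uses Python's re module rather than Perl's regular expressions (the two agree for this simple pattern). The name is_trusted is hypothetical.

```python
import re

# Same pattern as in the configuration example above.
TRUSTED = re.compile(r"\.w3\.org$")


def is_trusted(host):
    """Return True if the host name matches the trusted-domain pattern."""
    return TRUSTED.search(host) is not None
```

Because the pattern starts with an escaped dot, www.w3.org and validator.w3.org match, while the bare name w3.org does not. The trailing $ keeps names such as www.w3.org.example.com from matching.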

Allow_Private_IPs is a boolean flag indicating whether checking links on non-public IP addresses is allowed. The default is true in command line mode and false when run as a CGI script. For example, to disallow checking non-public IP addresses, regardless of the mode, use:

    Allow_Private_IPs = 0
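
As a rough illustration of what a non-public address is, the sketch below uses Python's ipaddress module. This is an assumption made for illustration: checklink's own classification of addresses may differ, and the function names are hypothetical.

```python
import ipaddress


def is_private_address(address):
    """Return True for addresses in private or otherwise non-public ranges."""
    return ipaddress.ip_address(address).is_private


def may_check(address, allow_private_ips):
    """Mimic the flag: private addresses are checked only when allowed."""
    return allow_private_ips or not is_private_address(address)
```

With the flag set to 0 as above, may_check("10.0.0.1", False) is false, while a public address such as 8.8.8.8 is still checked.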


ENVIRONMENT

checklink uses the libwww-perl library, which has a number of environment variables affecting its behaviour. See SEE ALSO for some pointers.

W3C_CHECKLINK_CFG
If set, overrides the path to the configuration file.
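
The lookup order can be summarised with a small Python sketch. It is not checklink's code, and the function name config_path is hypothetical; the default path is the one listed under FILES.

```python
import os

# Default location, as listed under FILES.
DEFAULT_CONFIG = "/etc/w3c/checklink.conf"


def config_path(environ=None):
    """Return the configuration file path, honouring W3C_CHECKLINK_CFG if set."""
    env = os.environ if environ is None else environ
    return env.get("W3C_CHECKLINK_CFG", DEFAULT_CONFIG)
```

For example, config_path({"W3C_CHECKLINK_CFG": "/tmp/my.conf"}) returns /tmp/my.conf (a made-up path), and config_path({}) returns the default.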


SEE ALSO

The documentation for this program is available on the web at http://www.w3.org/2000/07/checklink.

LWP(3), Net::FTP(3), Net::NNTP(3), Net::IP(3).


AUTHOR

This program was originally written by Hugo Haas <hugo@w3.org>, based on Renaud Bruyeron's checklink.pl. It has since been enhanced by Ville Skyttä and many other volunteers. Use the <www-validator@w3.org> mailing list for feedback; see http://validator.w3.org/docs/checklink.html#csb for more information.

This manual page was written by Frédéric Schütz <schutz@mathgen.ch>, for the Debian GNU/Linux system (but may be used by others).


COPYRIGHT

This program is licensed under the W3C® Software License, http://www.w3.org/Consortium/Legal/copyright-software.