12 Links

Contents

  1. Introduction to links and anchors
    1. Visiting a linked resource
    2. Other link relationships
    3. Specifying anchors and links
    4. Link titles
    5. Internationalization and links
  2. The A element
    1. Syntax of anchor names
    2. Nested links are illegal
    3. Anchors with the id attribute
    4. Unavailable and unidentifiable resources
  3. Document relationships: the LINK element
    1. Forward and reverse links
    2. Links and external style sheets
    3. Links and search engines
  4. Path information: the BASE element
    1. Resolving relative URIs

12.1 Introduction to links and anchors

HTML offers many of the conventional publishing idioms for rich text and structured documents, but what separates it from most other markup languages is its features for hypertext and interactive documents. This section introduces the link (or hyperlink, or Web link), the basic hypertext construct. A link is a connection from one Web resource to another. Although a simple concept, the link has been one of the primary forces driving the success of the Web.

A link has two ends -- called anchors -- and a direction. The link starts at the "source" anchor and points to the "destination" anchor, which may be any Web resource (e.g., an image, a video clip, a sound bite, a program, an HTML document, an element within an HTML document, etc.).

12.1.1 Visiting a linked resource

The default behavior associated with a link is the retrieval of another Web resource. This behavior is commonly and implicitly obtained by selecting the link (e.g., by clicking, through keyboard input, etc.).

The following HTML excerpt contains two links, one whose destination anchor is an HTML document named "chapter2.htm" and the other whose destination anchor is a GIF image in the file "forest.gif":

<BODY>
...some text...
<P>You'll find a lot more in  <A href="chapter2.htm">chapter two</A>. 
See also this <A href="../images/forest.gif">map of the enchanted forest.</A>
</BODY>

By activating these links (by clicking with the mouse, through keyboard input, voice commands, etc.), users may visit these resources. Note that the href attribute in each source anchor specifies the address of the destination anchor with a URI.

The destination anchor of a link may be an element within an HTML document. The destination anchor must be given an anchor name and any URI addressing this anchor must include the name as its fragment identifier.

Destination anchors in HTML documents may be specified either by the A element (naming it with the name attribute), or by any other element (naming with the id attribute).

Thus, for example, an author might create a table of contents whose entries link to header elements H2, H3, etc., in the same document. Using the A element to create destination anchors, we would write:

<H1>Table of Contents</H1>
<P><A href="#section1">Introduction</A><BR>
<A href="#section2">Some background</A><BR>
<A href="#section2.1">On a more personal note</A><BR>
...the rest of the table of contents...
...the document body...
<H2><A name="section1">Introduction</A></H2>
...section 1...
<H2><A name="section2">Some background</A></H2>
...section 2...
<H3><A name="section2.1">On a more personal note</A></H3>
...section 2.1...

We may achieve the same effect by making the header elements themselves the anchors:

<H1>Table of Contents</H1>
<P><A href="#section1">Introduction</A><BR>
<A href="#section2">Some background</A><BR>
<A href="#section2.1">On a more personal note</A><BR>
...the rest of the table of contents...
...the document body...
<H2 id="section1">Introduction</H2>
...section 1...
<H2 id="section2">Some background</H2>
...section 2...
<H3 id="section2.1">On a more personal note</H3>
...section 2.1...

12.1.2 Other link relationships

By far the most common use of a link is to retrieve another Web resource, as illustrated in the previous examples. However, authors may insert links in their documents that express other relationships between resources than simply "activate this link to visit that related resource". Links that express other types of relationships have one or more link types specified in their source anchor.

The roles of a link defined by A or LINK are specified via the rel and rev attributes.

For instance, links defined by the LINK element may describe the position of a document within a series of documents. In the following excerpt, links within the document entitled "Chapter 5" point to the previous and next chapters:

<HEAD>
...other head information...
<TITLE>Chapter 5</TITLE>
<LINK rel="prev" href="chapter4.htm">
<LINK rel="next" href="chapter6.htm">
</HEAD>

The link type of the first link is "prev" and that of the second is "next" (two of several recognized link types). Links specified by LINK are not rendered with the document's contents, although user agents may render them in other ways (e.g., as navigation tools).

Even if they are not used for navigation, these links may be interpreted in interesting ways. For example, a user agent that prints a series of HTML documents as a single document may use this link information as the basis of forming a coherent linear document. Further information is given below of using links for the benefit of search engines

12.1.3 Specifying anchors and links

Although several HTML elements and attributes create links to other resources (e.g., the IMG element, the FORM element, etc.), this chapter discusses links and anchors created by the LINK and A elements. The LINK element may only appear in the head of a document. The A element may only appear in the body.

When the A element's href attribute is set, the element defines a source anchor for a link that may be activated by the user to retrieve a Web resource. The source anchor is the location of the A instance and the destination anchor is the Web resource.

The retrieved resource may be handled by the user agent in several ways: by opening a new HTML document in the same user agent window, opening a new HTML document in a different window, starting a new program to handle the resource, etc. Since the A element has content (text, images, etc.), user agents may render this content in such a way as to indicate the presence of a link (e.g., by underlining the content).

When the name or id attributes of the A element are set, the element defines an anchor that may be the destination of other links.

Authors may set the name and href attributes simultaneously in the same A instance.

The LINK element defines a relationship between the current document and another resource. Although LINK has no content, the relationships it defines may be rendered by some user agents.

12.1.4 Link titles

The title attribute may be set for both A and LINK to add information about the nature of a link. This information may be spoken by a user agent, rendered as a tool tip, cause a change in cursor image, etc.

Thus, we may augment a previous example by supplying a title for each link:

<BODY>
...some text...
<P>You'll find a lot more in <A href="chapter2.htm"
       title="Go to chapter two">chapter two</A>.
<A href="./chapter2.htm"
       title="Get chapter two.">chapter two</A>. 
See also this <A href="../images/forest.gif"
       title="GIF image of enchanted forest">map of
the enchanted forest.</A>
</BODY>

12.1.5 Internationalization and links

Since links may point to documents encoded with different character encodings, the A and LINK elements support the charset attribute. This attribute allows authors to advise user agents about the encoding of data at the other end of the link.

The hreflang attribute provides user agents with information about the language of a resource at the end of a link, just as the lang attribute provides information about the language of an element's content or attribute values.

Armed with this additional knowledge, user agents should be able to avoid presenting "garbage" to the user. Instead, they may either locate resources necessary for the correct presentation of the document or, if they cannot locate the resources, they should at least warn the user that the document will be unreadable and explain the cause.

12.2 The A element

<!ELEMENT A - - (%inline;)* -(A)       -- anchor -->
<!ATTLIST A
  %attrs;                              -- %coreattrs, %i18n, %events --
  charset     %Charset;      #IMPLIED  -- char encoding of linked resource --
  type        %ContentType;  #IMPLIED  -- advisory content type --
  name        CDATA          #IMPLIED  -- named link end --
  href        %URI;          #IMPLIED  -- URI for linked resource --
  hreflang    %LanguageCode; #IMPLIED  -- language code --
  rel         %LinkTypes;    #IMPLIED  -- forward link types --
  rev         %LinkTypes;    #IMPLIED  -- reverse link types --
  accesskey   %Character;    #IMPLIED  -- accessibility key character --
  shape       %Shape;        rect      -- for use with client-side image maps --
  coords      %Coords;       #IMPLIED  -- for use with client-side image maps --
  tabindex    NUMBER         #IMPLIED  -- position in tabbing order --
  onfocus     %Script;       #IMPLIED  -- the element got the focus --
  onblur      %Script;       #IMPLIED  -- the element lost the focus --
  >

Start tag: required, End tag: required

Attribute definitions

name = cdata [CS]
This attribute names the current anchor so that it may be the destination of another link. The value of this attribute must be a unique anchor name. The scope of this name is the current document. Note that this attribute shares the same name space as the id attribute.
href = uri [CT]
This attribute specifies the location of a Web resource, thus defining a link between the current element (the source anchor) and the destination anchor defined by this attribute.
hreflang = langcode [CI]
This attribute specifies the base language of the resource designated by href and may only be used when href is specified.
type = content-type [CI]
When present, this attribute specifies the content type of a piece of content, for example, the result of dereferencing a URI. Content types are defined in [MIMETYPES].
rel = link-types [CI]
This attribute describes the relationship from the current document to the anchor specified by the href attribute. The value of this attribute is a space-separated list of link types.
rev = link-types [CI]
This attribute is used to describe a reverse link from the anchor specified by the href attribute to the current document. The value of this attribute is a space-separated list of link types.
charset = charset [CI]
This attribute specifies the character encoding of the resource designated by the link. Please consult the section on character encodings for more details.

Attributes defined elsewhere

Each A element defines an anchor

  1. The A element's content defines the position of the anchor.
  2. The name attribute names the anchor so that it may be the destination of zero or more links (see also anchors with id).
  3. The href attribute makes this anchor the source anchor of exactly one link.

Authors may also create an A element that specifies no anchors, i.e., that doesn't specify href, name, or id. Values for these attributes may be set at a later time through scripts.

In the example that follows, the A element defines a link. The source anchor is the text "W3C Web site" and the destination anchor is "http://www.w3.org/":

For more information about W3C, please consult the 
<A href="http://www.w3.org/">W3C Web site</A>. 

This link designates the home page of the World Wide Web Consortium. When a user activates this link in a user agent, the user agent will retrieve the resource, in this case, an HTML document.

User agents generally render links in such a way as to make them obvious to users (underlining, reverse video, etc.). The exact rendering depends on the user agent. Rendering may vary according to whether the user has already visited the link or not. A possible visual rendering of the previous link might be:

For more information about W3C, please consult the W3C Web site.
                                                   ~~~~~~~~~~~~

To tell user agents explicitly what the character encoding of the destination page is, set the charset attribute:

For more information about W3C, please consult the 
<A href="http://www.w3.org/" charset="ISO-8859-1">W3C Web site</A> 

Suppose we define an anchor named "anchor-one" in the file "one.htm".

...text before the anchor...
<A name="anchor-one">This is the location of anchor one.</A>
...text after the anchor...

This creates an anchor around the text "This is the location of anchor one.". Usually, the contents of A are not rendered in any special way when A defines an anchor only.

Having defined the anchor, we may link to it from the same or another document. URIs that designate anchors contain a "#" character followed by the anchor name (the fragment identifier). Here are some examples of such URIs:

Thus, a link defined in the file "two.htm" in the same directory as "one.htm" would refer to the anchor as follows:

...text before the link...
For more information, please consult <A href="./one.htm#anchor-one"> anchor one</A>.
...text after the link...

The A element in the following example specifies a link (with href) and creates a named anchor (with name) simultaneously:

I just returned from vacation! Here's a
<A name="anchor-two" 
   href="http://www.somecompany.com/People/Ian/vacation/family.png">
photo of my family at the lake.</A>.

This example contains a link to a different type of Web resource (a PNG image). Activating the link should cause the image resource to be retrieved from the Web (and possibly displayed if the system has been configured to do so).

Note. User agents should be able to find anchors created by empty A elements, but some fail to do so. For example, some user agents may not find the "empty-anchor" in the following HTML fragment:

<A name="empty-anchor"></A>
<EM>...some HTML...</EM>
<A href="#empty-anchor">Link to empty anchor</A>

12.2.1 Syntax of anchor names

An anchor name is the value of either the name or id attribute when used in the context of anchors. Anchor names must observe the following rules:

Thus, the following example is correct with respect to string matching and must be considered a match by user agents:

<P><A href="#xxx">...</A>
...more document...
<P><A name="xxx">...</A>

ILLEGAL EXAMPLE:
The following example is illegal with respect to uniqueness since the two names are the same except for case:

<P><A name="xxx">...</A>
<P><A name="XXX">...</A>

Although the following excerpt is legal HTML, the behavior of the user agent is not defined; some user agents may (incorrectly) consider this a match and others may not.

<P><A href="#xxx">...</A>
...more document...
<P><A name="XXX">...</A>

Anchor names should be restricted to ASCII characters. Please consult the appendix for more information about non-ASCII characters in URI attribute values.

12.2.2 Nested links are illegal

Links and anchors defined by the A element must not be nested; an A element must not contain any other A elements.

Since the DTD defines LINK element to be empty, LINK elements may not be nested either.

12.2.3 Anchors with the id attribute

The id attribute may be used to create an anchor at the start tag of any element (including the A element).

This example illustrates the use of the id attribute to position an anchor in an H2 element. The anchor is linked to via the A element.

You may read more about this in <A href="#section2">Section Two</A>.
...later in the document
<H2 id="section2">Section Two</H2>
...later in the document
<P>Please refer to <A href="#section2">Section Two</A> above
for more details.

The following example names a destination anchor with the id attribute:

I just returned from vacation! Here's a
<A id="anchor-two">photo of my family at the lake.</A>.

The id and name attributes share the same name space. This means that they cannot both define an anchor with the same name in the same document.

ILLEGAL EXAMPLE:
The following excerpt is illegal HTML since these attributes declare the same name twice in the same document.

<A href="#a1">...</A>
...
<H1 id="a1">
...pages and pages...
<A name="a1"></A>

Because of its specification in the HTML DTD, the name attribute may contain character references. Thus, the value D&#xfc;rst is a valid name attribute value, as is D&uuml;rst . The id attribute, on the other hand, may not contain character referenc.

Use id or name? Authors should consider the following issues when deciding whether to use id or name for an anchor name:

12.2.4 Unavailable and unidentifiable resources

A reference to an unavailable or unidentifiable resource is an error. Although user agents may vary in how they handle such an error, we recommend the following behavior:

12.3 Document relationships: the LINK element

<!ELEMENT LINK - O EMPTY               -- a media-independent link -->
<!ATTLIST LINK
  %attrs;                              -- %coreattrs, %i18n, %events --
  charset     %Charset;      #IMPLIED  -- char encoding of linked resource --
  href        %URI;          #IMPLIED  -- URI for linked resource --
  hreflang    %LanguageCode; #IMPLIED  -- language code --
  type        %ContentType;  #IMPLIED  -- advisory content type --
  rel         %LinkTypes;    #IMPLIED  -- forward link types --
  rev         %LinkTypes;    #IMPLIED  -- reverse link types --
  media       %MediaDesc;    #IMPLIED  -- for rendering on these media --
  >

Start tag: required, End tag: forbidden

Attributes defined elsewhere

This element defines a link. Unlike A, it may only appear in the HEAD section of a document, although it may appear any number of times. Although LINK has no content, it conveys relationship information that may be rendered by user agents in a variety of ways (e.g., a tool-bar with a drop-down menu of links).

This example illustrates how several LINK definitions may appear in the HEAD section of a document. The current document is "Chapter2.htm". The rel attribute specifies the relationship of the linked document with the current document. The values "Index", "Next", and "Prev" are explained in the section on link types.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
   "http://www.w3.org/TR/REC-html40/strict.dtd">
<HTML>
<HEAD>
  <TITLE>Chapter 2</TITLE>
  <LINK rel="Index" href="../index.htm">
  <LINK rel="Next"  href="Chapter3.htm">
  <LINK rel="Prev"  href="Chapter1.htm">
</HEAD>
...the rest of the document...

12.3.1 Forward and reverse links

The rel and rev attributes play complementary roles -- the rel attribute specifies a forward link and the rev attribute specifies a reverse link.

Consider two documents A and B.

Document A:       <LINK href="docB" rel="foo">

Has exactly the same meaning as:

Document B:       <LINK href="docA" rev="foo">

Both attributes may be specified simultaneously.

12.3.2 Links and external style sheets

When the LINK element links an external style sheet to a document, the type attribute specifies the style sheet language and the media attribute specifies the intended rendering medium or media. User agents may save time by retrieving from the network only those style sheets that apply to the current device.

Media types are further discussed in the section on style sheets.

12.3.3 Links and search engines

Authors may use the LINK element to provide a variety of information to search engines, including:

The examples below illustrate how language information, media types, and link types may be combined to improve document handling by search engines.

In the following example, we use the hreflang attribute to tell search engines where to find Dutch, Portuguese, and Arabic versions of a document. Note the use of the dir and charset attributes for the Arabic manual, and the use of the lang attribute to indicate that the value of the title attribute for the LINK element designating the French manual is in French.

<HEAD>
<TITLE>The manual in English</TITLE>
<LINK title="The manual in Dutch"
      type="text/html"
      rel="alternate"
      hreflang="nl" 
      href="http://someplace.com/manual/dutch.htm">
<LINK title="The manual in Portuguese"
      type="text/html"
      rel="alternate"
      hreflang="pt" 
      href="http://someplace.com/manual/portuguese.htm">
<LINK title="The manual in Arabic"
      dir="rtl"
      type="text/html"
      rel="alternate"
      charset="ISO-8859-6"
      hreflang="ar" 
      href="http://someplace.com/manual/arabic.htm">
<LINK lang="fr" title="La documentation en Fran&ccedil;ais"
      type="text/html"
      rel="alternate"
      hreflang="fr"
      href="http://someplace.com/manual/french.htm">
</HEAD>

In the following example, we tell search engines where to find the printed version of a manual.

<HEAD>
<TITLE>Reference manual</TITLE>
<LINK media="print" title="The manual in postscript"
      type="application/postscript"
      rel="alternate"
      href="http://someplace.com/manual/postscript.ps">
</HEAD>

In the following example, we tell search engines where to find the front page of a collection of documents.

<HEAD>
<TITLE>Reference manual -- Page 5</TITLE>
<LINK rel="Start" title="The first page of the manual"
      type="text/html"
      href="http://someplace.com/manual/start.htm">
</HEAD>

Further information is given in the notes in the appendix on helping search engines index your Web site.

12.4 Path information: the BASE element

<!ELEMENT BASE - O EMPTY               -- document base URI -->
<!ATTLIST BASE
  href        %URI;          #REQUIRED -- URI that acts as base URI --
  >

Start tag: required, End tag: forbidden

Attribute definitions

href = uri [CT]
This attribute specifies an absolute URI that acts as the base URI for resolving relative URIs.

Attributes defined elsewhere

In HTML, links and references to external images, applets, form-processing programs, style sheets, etc. are always specified by a URI. Relative URIs are resolved according to a base URI, which may come from a variety of sources. The BASE element allows authors to specify a document's base URI explicitly.

When present, the BASE element must appear in the HEAD section of an HTML document, before any element that refers to an external source. The path information specified by the BASE element only affects URIs in the document where the element appears.

For example, given the following BASE declaration and A declaration:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
   "http://www.w3.org/TR/REC-html40/strict.dtd">
<HTML>
 <HEAD>
   <TITLE>Our Products</TITLE>
   <BASE href="http://www.aviary.com/products/intro.htm">
 </HEAD>

 <BODY>
   <P>Have you seen our <A href="../cages/birds.gif">Bird Cages</A>?
 </BODY>
</HTML>

the relative URI "../cages/birds.gif" would resolve to:

http://www.aviary.com/cages/birds.gif

12.4.1 Resolving relative URIs

User agents must calculate the base URI for resolving relative URIs according to [RFC1808], section 3. The following describes how [RFC1808] applies specifically to HTML.

User agents must calculate the base URI according to the following precedences (highest priority to lowest):

  1. The base URI is set by the BASE element.
  2. The base URI is given by meta data discovered during a protocol interaction, such as an HTTP header (see [RFC2068]).
  3. By default, the base URI is that of the current document. Not all HTML documents have a base URI (e.g., a valid HTML document may appear in an email and may not be designated by a URI). Such HTML documents are considered erroneous if they contain relative URIs and rely on a default base URI.

Additionally, the OBJECT and APPLET elements define attributes that take precedence over the value set by the BASE element. Please consult the definitions of these elements for more information about URI issues specific to them.

Link elements specified by HTTP headers are handled exactly as LINK elements that appear explicitly in a document.