Tag soup

Browsers therefore need mechanisms to cope with the appearance of "tag soup", accepting invalid syntax and structure and correcting it where possible.

All major web browsers currently have a tag soup parser for interpreting malformed HTML; most of this error handling has since been standardized in the HTML5 parsing algorithm.

I have used this term in my instruction for years to characterize the jumble of angle brackets acting like tags in HTML in pages that are accepted by browsers.

[...] I've never seen the term defined anywhere.

The Markup Validation Service is a resource for web page authors to avoid creating tag soup.[2] While many graphical web editors produce well-formed markup, an author writing code by hand in a text editor and then testing it in only one browser can easily miss such errors.

In response to this pressure, browser makers unilaterally added new proprietary features to HTML that fell outside the standards at the time.

The growth of this problem was slowed to some extent by the introduction of new W3C standards, such as CSS, introduced in 1998, which provided greater flexibility in the presentation and layout of web pages without requiring large numbers of additional HTML elements and attributes.

In 2004, Apple, Mozilla and Opera founded the WHATWG, with the intent of creating a new version of the HTML specification which all browser behavior would match.

As more browsers support newer revisions of standards, the pressure on web developers to use non-standard code to solve problems diminishes.

The XML specification requires that a conforming user agent (such as a web browser) reject a document and stop parsing it if any syntax error is encountered.
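This "draconian" error handling can be seen in any conforming XML parser. As a minimal sketch, using Python's standard-library `xml.etree` (chosen here only for illustration; any conforming XML parser behaves the same way), a single unclosed tag is a fatal error:

```python
import xml.etree.ElementTree as ET

# Well-formed XML parses normally.
ET.fromstring("<p>hello <b>world</b></p>")

# A single unclosed <b> tag is a fatal error: the parser must stop,
# not guess at the author's intent.
raised = False
try:
    ET.fromstring("<p>hello <b>world</p>")
except ET.ParseError as err:
    raised = True
    print("fatal:", err)
```

The parser reports the mismatched tag and refuses to produce a document tree, rather than attempting any recovery.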

Lacking such a directive, HTML browsers must use complex algorithms to infer the author's intended meaning in a wide range of cases where invalid syntax is encountered.
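The contrast with the XML case can be sketched with Python's standard-library `html.parser`, a tolerant tokenizer (a real browser additionally runs the full HTML5 tree-construction algorithm, which repairs misnesting when building the DOM). Fed misnested tags, it simply reports the events as seen, with no error:

```python
from html.parser import HTMLParser

class TagLogger(HTMLParser):
    """Records start/end tag events instead of building a tree."""
    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

logger = TagLogger()
# Misnested tags: </b> arrives while <i> is still open.
logger.feed("<b><i>bold italic</b></i>")
print(logger.events)
# → [('start', 'b'), ('start', 'i'), ('end', 'b'), ('end', 'i')]
```

Deciding what DOM this markup should produce is exactly the kind of inference the browser's error-recovery algorithm must perform.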

By providing namespaces, XHTML combined with CSS allows authoring communities to easily extend the semantic vocabulary of documents.

XHTML departs from backwards compatibility and takes the approach that parsers should become less tolerant of badly formed markup. HTML5, by contrast, acknowledges that badly formed HTML code already exists in large quantities and will probably continue to be used, and takes the view that the specification should be expanded to ensure maximum compatibility with such code.

Unlike the strict XHTML, HTML and its predecessor SGML are designed to be written by humans, and already have a significant degree of flexibility in syntax, such as the ability to omit certain start and end tags, to reduce boilerplate.

[10] Despite their validity, these omissions still require a parser with specific knowledge of HTML (as opposed to the more rigid XML) to handle them.
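For example, `</li>` end tags may be omitted in valid HTML, so a parser must know from the language itself where each list item ends. A minimal sketch, again using Python's `html.parser` purely for illustration:

```python
from html.parser import HTMLParser

class ItemCounter(HTMLParser):
    """Counts explicit <li> start and end tags."""
    def __init__(self):
        super().__init__()
        self.li_starts = 0
        self.li_ends = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.li_starts += 1

    def handle_endtag(self, tag):
        if tag == "li":
            self.li_ends += 1

counter = ItemCounter()
# Valid HTML: the </li> end tags are omitted.
counter.feed("<ul><li>One<li>Two<li>Three</ul>")
print(counter.li_starts, counter.li_ends)
# → 3 0
```

An HTML-aware parser accepts this document and infers the three implied end tags; an XML parser would reject the same markup as not well-formed.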