Semantic HTML is the use of HTML markup to reinforce the semantics, or meaning, of the information in webpages and web applications rather than merely to define its presentation or look. Semantic HTML is processed by traditional web browsers as well as by many other user agents. CSS is used to suggest its presentation to human users.
As an example, recent HTML standards discourage use of the tag <i> (italic, a typeface style) in favor of more accurate tags such as <em> (emphasis); the style sheet should then specify whether emphasis is denoted by an italic font, a bold font, underlining, slower or louder audible speech, etc. This is because italics are used for purposes other than emphasis, such as citing a source; for this, HTML 4 provides the tag <cite>. Another use for italics is foreign phrases or loanwords; web designers may use built-in XHTML language attributes or specify their own semantic markup by choosing appropriate names for the class attribute values of HTML elements (e.g. class="loanword"). Marking emphasis, citations and loanwords in different ways makes it easier for web agents such as search engines and other software to ascertain the significance of the text.
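To illustrate how distinct markup helps software tell these cases apart, the following is a minimal sketch of a hypothetical "web agent" built on Python's standard html.parser module. The sample sentence, the RoleCollector class, the classify function and the role names are all assumptions made for illustration; only the <em> and <cite> tags and the class="loanword" convention come from the text above. The sketch assumes well-nested markup with no void elements such as <br>.

```python
from html.parser import HTMLParser

# Hypothetical sample markup: three runs of text that a browser might
# render in italics, each marked up with a different meaning.
SAMPLE = (
    '<p>She <em>really</em> enjoyed <cite>Moby-Dick</cite>, '
    'calling it a <span class="loanword" lang="fr">tour de force</span>.</p>'
)


class RoleCollector(HTMLParser):
    """Collects (role, text) pairs for emphasis, citations and loanwords."""

    def __init__(self):
        super().__init__()
        self.stack = []  # role (or None) for each currently open element
        self.found = []  # (role, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "em":
            role = "emphasis"
        elif tag == "cite":
            role = "citation"
        elif "loanword" in classes:
            role = "loanword"
        else:
            role = None
        self.stack.append(role)

    def handle_endtag(self, tag):
        # Assumes well-nested input: each end tag closes the innermost element.
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Attribute the text to the innermost enclosing element that has a role.
        role = next((r for r in reversed(self.stack) if r), None)
        if role and data.strip():
            self.found.append((role, data.strip()))


def classify(html):
    """Return the (role, text) pairs found in an HTML fragment."""
    parser = RoleCollector()
    parser.feed(html)
    parser.close()
    return parser.found


print(classify(SAMPLE))
# [('emphasis', 'really'), ('citation', 'Moby-Dick'), ('loanword', 'tour de force')]
```

Had all three phrases been wrapped in <i>, the same program would have had no way to distinguish them, which is the point the paragraph above makes about search engines and other software.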
HTML has included semantic markup since its inception. In an HTML document, the author may, among other things, "start with a title; add headings and paragraphs; add emphasis to [the] text; add images; add links to other pages; [and] use various kinds of lists".
Various versions of the HTML standard have included presentational markup such as <font> (added in HTML 3.2; removed in HTML 4.0 Strict), <i> (all versions) and <center> (added in HTML 3.2). There are also the semantically neutral <span> and <div> tags. Since the late 1990s, when Cascading Style Sheets began to work in most browsers, web authors have been encouraged to avoid presentational HTML markup in order to separate presentation from content.
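As a rough illustration of how an author might audit a page for such markup, the sketch below scans a fragment for the three presentational elements named above, again using Python's standard html.parser module. The PRESENTATIONAL set, the PresentationalFinder class, the find_presentational function and the LEGACY sample are hypothetical names chosen for this example, and the set is not an exhaustive list drawn from any specification.

```python
from html.parser import HTMLParser

# Presentational elements named in the text above; illustrative only,
# not an exhaustive list from any HTML specification.
PRESENTATIONAL = {"font", "i", "center"}


class PresentationalFinder(HTMLParser):
    """Records each presentational start tag in document order."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in PRESENTATIONAL:
            self.hits.append(tag)


def find_presentational(html):
    """Return the presentational tag names found in an HTML fragment."""
    finder = PresentationalFinder()
    finder.feed(html)
    finder.close()
    return finder.hits


# Hypothetical legacy fragment mixing layout and content.
LEGACY = '<CENTER><font color="red">Sale!</font></CENTER><p>Ends <i>soon</i>.</p>'

print(find_presentational(LEGACY))
# ['center', 'font', 'i']
```

Each hit marks a place where the visual effect could instead be expressed in a style sheet, leaving the markup to describe only what the content is.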