JavaScript is a language for programming: a sequence of instructions causes actions to be taken. HyperText Markup Language, HTML, is used to describe a document. While JavaScript has keywords and operators for declaring variables, initializing loops, and computing values, HTML only has keywords used to describe things like paragraphs, section headings, tables, and lists. You can’t compute with HTML, but you can describe web pages. Later, you can mix in JavaScript to make those pages more active and interactive.
If you write a document in Word, you can describe the typesetting style of a document directly. For example, you can select some words, set the font size to 134, make the font bold, and change the font color to purple. You can do that for the heading text for each section of the document, if you want.
The problem with setting styles in this way is that it is hard to maintain a uniform style across documents, or to change styles quickly as a group. In Word, if you later decide that the headings of your document should be orange and flashing in a gothic font, you’ll need to go through the entire document, select each piece of heading text, and then change the font size and style. And what if you miss one? That would be ugly.
HTML only lets you specify the intended use, not the style, of a section of text. You can specify that some text should be a heading. When the browser displays that text, it will use the style for headings. You can specify that style using a different language, CSS. By separating meaning (or semantics) from style, it is possible to display documents using consistent styles, and to change those styles in simple ways.
Tags, written using angle brackets, specify the intended use of text. Click on the ► button to see how the HTML example below is displayed:
The example HTML uses <h1>
to mark a section heading and <p>
to mark a paragraph. Tags almost always come in pairs: a start tag like <h1>
, and an ending tag with the same name following a forward slash, like </h1>
.
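A fragment of that kind might read as follows. This is a minimal sketch: the wording of the heading and paragraph is a placeholder chosen for illustration.

```html
<!-- Placeholder text; any heading and paragraph would do. -->
<h1>My First Heading</h1>
<p>This is a paragraph. It begins with a start tag and finishes with an end tag.</p>
```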
It might seem annoying to have to start and end each paragraph using a paragraph tag, but the advantage is that the HTML parser (the program that reads and interprets the HTML) knows with perfect certainty that you intended the text within the paragraph markers to be a paragraph. Perhaps you prefer to typeset paragraphs with some extra vertical space between them (as is the current setting on this page and in the display of the Constitution), or perhaps you are publishing a book and paragraphs should not be separated. By marking paragraphs with tags, you can make the styling decision later.
You may have noticed the 1
in the <h1>
tag. Section headings may be of different levels of importance. The top level h1
tag specifies that this heading is of primary importance: perhaps it is the title of the document, or the title of a chapter. An h2
tag would specify a second-level heading. For example, a document might have sections with h2
headings. And those sections might have subsections with h3
headings. In general, it is considered good practice to not skip headings on the way down: if you used an h1
for a previous heading, you would typically not skip down to h3
directly for a subsection, but would use an h2
next.
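As a rough sketch of heading levels used in order, a document outline might look like this. The subject matter is hypothetical and only the nesting of levels matters.

```html
<!-- h1: the title of the whole document -->
<h1>Field Guide to Backyard Birds</h1>

<!-- h2: a section within the document -->
<h2>Hummingbirds</h2>

<!-- h3: a subsection within that section (no level skipped) -->
<h3>Feeding habits</h3>
<p>Hummingbirds feed on nectar and small insects.</p>
```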
Objective: use heading and paragraph tags in an .html document.
Here’s a transcription of the Constitution, in case you have not yet memorized it. Write HTML code including the title (The Constitution…), up through the first two paragraphs of Section 2.
Here is a solution.
In the Constitution example above, each starting tag was followed by some text and then directly by its closing tag.
<p> We the People of the United States, in Order...</p>
What if we wanted to make We the People of the United States
stand out, since in the original document, it’s written larger? We could use the <strong>
tag to indicate that this text should be strongly emphasized when typeset, perhaps by making the text bold, or purple, or flashing orange.
Notice that We the People...
is still part of the first paragraph, so it should be inside the paragraph tags, but since it is strong, it should also be inside strong tags. The pattern is <p> <strong> some text </strong> </p>
. We say that the strong
tags are nested inside the paragraph tags. Tags can be nested as deeply as needed, but some nestings don’t make sense; for example, you cannot nest a paragraph inside a paragraph.
The pieces of HTML code we have seen so far are really just fragments of web pages. There is other information that the browser needs in order to present a web page correctly, and that search engines need to correctly index the page. Here is a complete example of a simple web page; when writing a new web page, you are welcome to copy-and-paste this text as your starting point. (You do not need to attribute the author, since this is a generic, standard structure, containing no intellectual contribution.)
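A minimal sketch of such a page follows. The title and body text are placeholders, and the meta line, which declares the character encoding, is an addition that the discussion below does not cover.

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- Addition for illustration: tells the browser how the text is encoded. -->
    <meta charset="utf-8">
    <!-- Shown in the browser tab, not on the page itself. -->
    <title>A simple web page</title>
  </head>
  <body>
    <!-- Everything inside body is shown on screen. -->
    <h1>A simple web page</h1>
    <p>Some text content for the page.</p>
  </body>
</html>
```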
Let’s look at the structure of the document. There are many types of documents on the web; it’s easiest for a browser to identify a document as an HTML page if the first line of the HTML document declares the document type. So, HTML documents should start with the line <!DOCTYPE html>
. This is not really a tag, and so it appears without a paired closing tag.
<!DOCTYPE html>
is not really part of the HTML language, but the rest of the document is HTML. Traditionally, the HTML part of the document is enclosed in a <html> </html>
tag pair. Scroll down in the example to find the closing </html>
tag.
Indentation is used to show nested tags. In the example, everything inside the <html> </html>
tag pair is indented.
Complete HTML documents are separated into two sections, the header (marked by the <head></head> tag pair) and the body (marked by the <body></body> tag pair), both of which are nested inside the <html></html> tag pair. Information for the browser is placed into the header and not displayed; the body contains the material to be shown on screen.
The document should have a title that will be displayed in the browser tab, or used as the name when you bookmark the page. You can set the title of the document using the <title></title>
tag.
The title is not displayed anywhere on the page itself, but is just used internally by the browser. Therefore, it occurs in the header section of the HTML document.
Do not confuse the words heading and header. The header is the first section of the HTML document where information for the browser is placed; headings, like those marked by the <h1></h1>
tag pair, are used to put titles on parts of the content text.
Even if you make a mistake in your HTML, for example, by forgetting to add <!DOCTYPE html>
, most browsers will take their best guess and display your page anyway. This is problematic, since different browsers might do different things, giving different users different experiences. It’s good practice to use an HTML validator to verify that your HTML code is legitimate. Try the validator now on a few favorite web pages, including this one. As of this writing, www.bloomberg.com
has several warnings and errors. Oops.
The <ul>
tag creates an unordered list of items, which might be displayed as a bulleted list. Each new item is described by a <li></li>
tag pair. Here’s an example.
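A small unordered list might be sketched like this. The three items are placeholders.

```html
<ul>
  <!-- Each item is indented to show that it is nested inside the list. -->
  <li>Hummingbird</li>
  <li>Airplane</li>
  <li>UFO</li>
</ul>
```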
Because the list items are contained within the list, the items are indented. This is not a requirement, but is good stylistic practice – it makes the document easier for the original author or later maintainers of the web page to read.
An ordered list is typically used to display a list with numbers or letters preceding the items. Ordered lists are indicated using <ol></ol>
tags; items are marked using <li></li>
tags. Change the list in the above example into an ordered list; don’t forget to change the closing tag.
To write an outline, you can nest ordered lists inside of other lists:
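One way such an outline could look is sketched below, with made-up topics. Each inner list is placed inside the list item it belongs to, before that item's closing tag.

```html
<ol>
  <li>Tags
    <!-- The nested list lives inside the first item. -->
    <ol>
      <li>Headings</li>
      <li>Paragraphs</li>
    </ol>
  </li>
  <li>Lists
    <ol>
      <li>Unordered lists</li>
      <li>Ordered lists</li>
    </ol>
  </li>
</ol>
```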
This outline includes only Arabic numerals, rather than letters or Roman numerals; a quick web search will tell you how to make this outline look more like a traditional outline, but we will not go into the details.
Objective: Modify HTML code.
Tables, denoted by <table></table>
tag pairs, contain rows, marked by <tr></tr>
tags. Each row contains table data entries, each marked by <td></td>
tag pairs. Here is an example:
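A table of this shape might be sketched as follows. The stock symbols, dates, and prices are all made up for illustration, and every cell, including the row and column labels, is marked as plain table data.

```html
<!-- Closing prices; all values below are invented for illustration. -->
<table>
  <tr>
    <td></td>
    <td>Jan 4</td>
    <td>Jan 5</td>
  </tr>
  <tr>
    <td>AAA</td>
    <td>10.00</td>
    <td>10.50</td>
  </tr>
  <tr>
    <td>BBB</td>
    <td>20.00</td>
    <td>19.75</td>
  </tr>
</table>
```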
When you display the table, you’ll notice that the headers for the columns (dates) and rows (stock symbols) are not displayed any differently than any other table data items. Row and column headers should be marked with the <th>
tag rather than the <td>
tag. Fix the table now.
The first line of the last exercise starts with <!--
. This is not really a tag, but rather indicates the start of a comment that will never be displayed on the page. Comments are notes for the human reader. Comments begin with <!--
and end with -->
. Be careful: anyone can view the HTML source of a webpage that you put on the web, so these comments will be viewable to anyone who wants to see them, even though they do not show up on the browser’s standard display of the web page.
Tags in HTML surround text, whether that text is a paragraph, a heading, or an item in a list. But HTML documents also include links to other documents, or to images for display. The <img>
tag is used to indicate that an image should be displayed. The <img>
tag is not followed by a closing </img>
tag.
To specify where the image that is to be displayed should be found, the <img>
tag includes an attribute, src
. The attribute is assigned a string value (using quotation marks) that indicates the URL where the image is located. The URL may be the complete web address of the image, as in the example, or it might be simply the name of the file, if the file is located in the same directory as the current web page.
The example <img>
tag includes an optional attribute, style
, which can be used to set the width and height to which the image should be scaled. (The original image of the Earth is 2048x2048 pixels, which is too large to display in this little window.)
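A sketch of an image tag with both attributes is shown below. It assumes a local file named earth.jpg (a hypothetical name) in the same directory as the web page, rather than a complete web address. The alt attribute, which supplies a text description for when the image cannot be shown, is an addition not covered in this discussion.

```html
<!-- src: where to find the image; style: scale it down to 200 by 200 pixels. -->
<img src="earth.jpg" alt="The Earth seen from space" style="width: 200px; height: 200px;">
```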
The image file is not included in the .html
file in any way. If you reference an image in the current directory, and then move the html file somewhere else, the image will not follow the .html
file, and no image will be displayed.
The web is always changing. If you use a complete image URL, and the image at that URL is no longer accessible at some point, no image will be displayed. To avoid this potential problem, it is often better to copy the image file into your directory and link to the new, local version.
There are tradeoffs when considering copying resources like images into your own directory, vs. linking to a remote resource from some other location. You may be charged for bandwidth (network transfer costs) for resources that you keep locally.
There are also copyright considerations. Do not, under any circumstances, assume that any particular image found on the web can be used by your web site, with or without attribution.
By default, the creator of an image retains copyright, and you cannot use that image on your site, whether included locally or linked remotely with an img
tag. However, some images might have licenses that permit you specific rights. For example, the Creative Commons Attribution 4.0 license that many images on Wikipedia use may allow others to use images, as long as certain rules for attribution (citation) are followed carefully.
Links to other documents have two primary components: some link text (the words displayed in your browser), specified as the text between <a>
and </a>
tags, and the URL of the page you want to link to, specified using the href
attribute of the <a>
tag:
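A link nested inside a paragraph could be sketched like this. The surrounding sentence is a placeholder, and the href attribute holds the URL of the destination page.

```html
<p>
  Read more about the
  <!-- The text between the a tags is what the reader sees and clicks on. -->
  <a href="https://en.wikipedia.org/wiki/Solar_System">Solar System</a>
  on Wikipedia.
</p>
```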
The link in this example is nested inside a paragraph, but you might imagine nesting them inside of list items, to create an unordered list of links, or inside of table data items, to create a table of links.
You can nest an image inside a link to use the image as a link.
Objective: Combine image and link tags to make an image link.
Use the image of the Earth from the previous example as a link to the Wikipedia article on the Solar System (https://en.wikipedia.org/wiki/Solar_System
):
Here is a solution.
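One possible solution is sketched here, under the assumption that the image is stored locally as earth.jpg (a hypothetical file name). The alt attribute is an addition that describes the image in text.

```html
<!-- The image takes the place of the link text, so clicking it follows the link. -->
<a href="https://en.wikipedia.org/wiki/Solar_System">
  <img src="earth.jpg" alt="The Earth" style="width: 200px; height: 200px;">
</a>
```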
You can embed a web page within a web page, using the <iframe>
tag. Why would you ever want to do this? The small HTML coding windows on this very page are one example: the HTML text is written in the editor on the left (a fancy textarea), but an iframe is used to display the rendered output on the right.
Another reason to use an iframe is that a particular website provides some complex service, and you want to make use of that service in your web page. Embedding YouTube videos in your web page is an example of this: you’d like to display a fancy video player, and YouTube provides some HTML code to embed in your web page to make it happen.
I found the following HTML code by finding a video on YouTube, clicking on share and then embed.
<iframe width="560" height="315" src="https://www.youtube.com/embed/ZCBE8ocOkAQ?rel=0" frameborder="0" allowfullscreen></iframe>
In the HTML code above, the text ?rel=0
is my own addition, and requests that YouTube not play any related videos after the Falcon lands. This is an example of a query string, a way of passing information to the server (the remote computer that sends the webpage across the network to your computer) as part of a GET request for the webpage. GET requests will be discussed in more detail in a later chapter.
iframes have a benefit for the website providing the resource: the content of that website is displayed in the manner the provider prefers. We displayed a YouTube video through an iframe view of the YouTube website; if youtube.com
suddenly decides to play an ad before or after that video, it can do so. Or YouTube could decide to ignore the ?rel=0
hint, and play related videos. youtube.com
also gets information about visitors to the site, even if those visits happen through iframes. You’ve just visited the youtube.com site!
There are some restrictions on iframes. For example, a website provided over a secure encrypted connection (such as this one) cannot include an iframe reference to an insecure website, since a website with insecure content cannot itself be secure. You can tell that a website is provided through an encrypted connection if the address begins with https://
rather than http://
.
The decision of who owns and controls the content of your website is an important one. For example, if you use an <img>
tag to embed a remotely-provided image, the owner of that image can change what image appears at that web address, or delete the image entirely, breaking your site. It might be better to download the image and store it locally, and link to the local copy. Similarly, iframes should be used very sparingly, and link only to trusted providers.
When you type an address into a web browser, or follow a link from a web page, your computer requests that some remote computer send you the content of a document. Let’s call your computer the client and the remote computer the server. Your request may pass through several intermediate computers on its way to the server, and the response from the server may pass through several intermediate computers on its way back to the client.
Depending on the type of information sent, this could be a problem. If you send your credit card information to Amazon, but on the way, it passes through some other computer controlled by evil people, you might find yourself with a large credit card bill for things you did not order. If you have permission to download some secret designs for a new invention, and do so, those designs might pass through the hands of anybody who has control of any computer between the server and your machine, the client.
Or maybe I build a web page that looks like Amazon’s, place myself in between some people and Amazon (not hard to do; most computers on the internet are on some route to Amazon), and collect personal information by impersonating Amazon: a spoofing attack.
Encryption is used to encode information at the source before sending it to the destination, so that only the intended computer can decode and understand the information.
Creating an encrypted connection requires some additional computational work, and so not all websites are encrypted. You can tell the difference based on the address: sites that start with the prefix https://
are provided through an encrypted connection, and sites that start with the prefix http://
are not. As a rule of thumb, it’s wise to enable encryption for commercial websites.
I see that amazon.com
’s main web page is not provided over an encrypted channel, even if I log in. However, the page where I enter my credit card information is. mint.com
is provided over an encrypted channel. In addition to the web address, you can verify encryption by looking for a lock icon in your browser’s address bar.
I think that all section headings in a web page should be purple, and that web pages should have orange backgrounds; I’m apparently not alone. Let’s do it:
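A page styled this way might be sketched as follows. The title and text are placeholders. The background uses the hexadecimal web color #ffa500, which is the same color as the named value orange.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Purple headings</title>
    <style type="text/css">
      /* Every h1 element in the document is drawn in purple. */
      h1 {
        color: purple;
      }
      /* The whole page gets an orange background. */
      body {
        background-color: #ffa500;
      }
    </style>
  </head>
  <body>
    <h1>A purple heading</h1>
    <p>Some text on an orange background.</p>
  </body>
</html>
```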
The <style>
tag starts the description of your modifications to the style of the document. The value of the type
attribute in <style type="text/css">
tells the web browser that the language used to describe the style will be Cascading Style Sheets, or CSS for short.
The CSS language is quite simple: it is a list of rules specifying how elements should be displayed. Each rule starts with a selector: a text description that selects some elements of the HTML document. In the example, h1
in the style section selects all h1
elements in the HTML document. Following each selector, within a pair of curly braces {}
, there is a list of style properties and the value(s) for each. Each property has a name, and can take certain values. For example, the color
property can take on certain named color values, like purple
, or web colors described by a hexadecimal number following a pound sign, #
.
There is a semicolon after each property assignment, much like in JavaScript.
By default, HTML elements like images are displayed on the left side of the page, with nothing else on the same line:
The float
property lets you specify different behavior, and can take on values like left
, right
, or none
(the default behavior). If you float the image to the right, then the image will be placed on the right-hand side of the screen, and text that appears later in the HTML code will flow around the image. Clever use of floats is a large part of how modern web pages position elements on a page.
Float the image to the right, using the float
property.
Here is a solution.
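One way to do this is sketched below, assuming a local image file named earth.jpg (a hypothetical name) and placeholder paragraph text. The alt attribute is an addition that describes the image.

```html
<style type="text/css">
  /* Move every image to the right edge; later text flows around it. */
  img {
    float: right;
  }
</style>

<img src="earth.jpg" alt="The Earth" style="width: 200px; height: 200px;">
<p>This paragraph comes after the image in the HTML, so its text flows
around the left side of the floated image.</p>
```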
id
attribute
The style rules so far select all elements of a particular type. What if you had many images in the document, and only wanted one of them to float to the right? Or wanted only one heading out of many to be red? We need a way to select a particular element of the HTML page.
In order to specify a particular element, we need to use both HTML and CSS. In the HTML, we need to give a name, called the id
to the element using the special id
attribute. In the CSS, we need to use a selector that works based on the id, rather than the tag name. Let’s look at a simple example with some HTML h3
tags.
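Such an example might be sketched like this. The heading text is a placeholder, and only the middle heading carries the id.

```html
<style type="text/css">
  /* The pound sign selects the one element whose id is redrum_heading. */
  #redrum_heading {
    color: red;
  }
</style>

<h3>An ordinary heading</h3>
<h3 id="redrum_heading">A red heading</h3>
<h3>Another ordinary heading</h3>
```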
There are a few things to note:
redrum_heading
. (Well, it might work; browsers are forgiving. But don’t do it!)
id
attribute.
class
attribute
An id
should be used to select a specific, named element from the HTML, and each id should be unique in the HTML document. But sometimes, you want to select several elements. For example, perhaps you have several paragraphs in the text of your new novel, but you decide that some of the paragraphs are warning notes, and should be typeset differently:
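A sketch of this situation follows, with invented paragraph text. Two of the four paragraphs are placed in a class named warning, and one style rule applies to both.

```html
<style type="text/css">
  /* The dot selects every element in the class warning. */
  .warning {
    color: red;
    border: 1px solid red;
  }
</style>

<p>It was a dark and stormy night.</p>
<p class="warning">Warning: this chapter contains spoilers.</p>
<p>The rain fell in torrents.</p>
<p class="warning">Warning: the narrator is unreliable.</p>
```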
The class
attribute in the HTML can be used to specify that a particular element is in a group, called a class, with the class name given by the value of the attribute.
In the CSS, we can specify that a rule should apply to all elements of a particular class using a dot followed by the name of the class.
Note. It is tempting to overuse the class
attribute. For example, imagine you wanted to have block quotes in a Wikipedia article about Mark Twain. It would be tempting to use a class="myquote"
tag in quoted paragraphs, and typeset all such paragraphs using a style rule. But in fact, a <blockquote>
tag already exists, and would be a much better choice, since using the tag would indicate more directly that this is really a quote, and not just some class you happened to call “myquote”. The next maintainer of the page will know exactly what is intended by the use of the proper tag.
Where possible, it’s also good to use classes semantically, to indicate meaning or intent. For example, we called the class above warning
rather than red_text_with_a_box_around_it
. That way, if we want to change warnings to be orange, we can do so without having to change the name of the class.
span
tag
Sometimes, you’d like to give a piece of text a name or a class, but you don’t want to put that text in its own paragraph. The <span>
tag can be used to create a new HTML element that by default, doesn’t have any special formatting applied to it. Then you can give the span tag an id or a class, and use CSS to style that piece of the document. We’ll see an example in the next section.
div
tag
The div
tag works much the same as the span
tag: it’s a generic element that can be used to package some HTML code into an element, and give that element a class
or id
. Once an element has a class or id, it can be styled, or as we will see soon, manipulated using JavaScript.
The span
and div
tags differ primarily in how they are displayed. A span
element is displayed inline: just like the <strong>
tag that we saw earlier, text inside a span
element appears in the same place it would without the span tags, while a div
is a block element that, like a paragraph, appears on its own lines in the document, unless you specify otherwise with style rules.
Because div
elements are meant to be displayed separately as blocks, it does not make much sense to use a <div>
tag nested inside of paragraph <p>
tag pairs. (It also doesn’t make sense to nest paragraphs inside of paragraphs. That would just be weird.)
You can think of the div
tag as being used to divide the document into large pieces, called divisions, while the span tag isolates and identifies small pieces of the document for styling or manipulation.
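The difference could be sketched as follows. The class name important, the id sidebar, and all of the text are hypothetical choices made for illustration.

```html
<style type="text/css">
  /* Applies to the span below, or to any other element in this class. */
  .important {
    color: red;
  }
  /* Applies to the one div with this id. */
  #sidebar {
    background-color: yellow;
  }
</style>

<!-- The span stays inline: the sentence is not broken onto new lines. -->
<p>Water the plants <span class="important">every day</span> this week.</p>

<!-- The div is a block: it groups a heading and a paragraph on their own lines. -->
<div id="sidebar">
  <h2>Notes</h2>
  <p>The div groups this heading and paragraph into one block.</p>
</div>
```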
You may have noticed that there is a dot in the style rule .important
in the previous example. As mentioned above, the dot tells CSS that important
is the name of a class, not the name of a tag: apply this style rule to all HTML elements that are in the class important
. In fact, we expect many tags, possibly even of different types, to be labelled in the same class: that’s the reason for using a class.
In a CSS style rule, the text before the style changes is called the selector; remember that you use pound #
to select an element by id, dot .
to select a group of elements by class, and a tag type to select a group of elements based on tag type.
Put the second list item in the important
class. If you are successful, the second list item (including the number 2) should turn red.
Here is a solution.
There are some other tricks. For example, you can follow the id, class, or tag specification with a colon and the keyword hover
to select items of that type that the mouse is hovering over:
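A minimal sketch, with placeholder list items: each item turns red only while the mouse pointer is over it.

```html
<style type="text/css">
  /* Applies only while the mouse is over a list item. */
  li:hover {
    color: red;
  }
</style>

<ul>
  <li>Home</li>
  <li>About</li>
  <li>Contact</li>
</ul>
```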
The hover
keyword is frequently used to build navigation menus that link to other pages or cause some action to happen when clicked on: the change of color hints to the user that some exciting action will happen if they click on that item.
CSS selectors can be even more complicated. For example, if you’d like to select all list items from the class important
(but not important paragraphs or unimportant list items), you could use a selector with no space between the tag name and the dot: li.important
.
Modify this code so that only the hummingbird and the UFO are red, by putting those list items into the same class, and modifying the selector.
Here is a solution.
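One possible solution is sketched here. The airplane item and the paragraph are placeholders added to show what the selector leaves alone. The selector joins the tag name and the class name with no space between them.

```html
<style type="text/css">
  /* Only list items that are also in the class important. */
  li.important {
    color: red;
  }
</style>

<p class="important">This important paragraph is not red.</p>
<ul>
  <li class="important">Hummingbird</li>
  <li>Airplane</li>
  <li class="important">UFO</li>
</ul>
```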
You can also select multiple tag types, classes, or ids with the same selector by using commas between the components of the selector. For example p, li:hover
would select all paragraphs, and all list items with the mouse over them, and apply the same rule to each.
Make all paragraphs and hovered list items red.
Here is a solution.
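A sketch of one way to do it, using placeholder text: a single rule whose selector lists both cases, separated by a comma.

```html
<style type="text/css">
  /* One rule, two selectors: all paragraphs, and list items under the mouse. */
  p, li:hover {
    color: red;
  }
</style>

<p>This paragraph is always red.</p>
<ul>
  <li>This item is red only while the mouse is over it.</li>
  <li>So is this one.</li>
</ul>
```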
You can select elements within elements. For example, you could use a div
tag to label a section of the document as being the navigation bar (with id="navbar"
, perhaps), and then select within that all list items within the navbar that were hovering and color them red.
The following example uses the tools described above, together with a few new style properties, to create a navigation bar for a simple web page. Read the code together with the discussion that follows the code.
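A navigation bar along these lines might be sketched as below. This is not the original code: the colors, spacing, link targets, and the choice of display: inline-block to place items side by side are all assumptions made for illustration.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Navigation bar sketch</title>
    <style type="text/css">
      /* The bar itself: a dark strip across the page. */
      #navbar {
        background-color: black;
      }
      /* Remove the bullets and the default list spacing. */
      #navbar ul {
        list-style-type: none;
        margin: 0;
        padding: 0;
      }
      /* Lay the items out side by side, with some room around each. */
      #navbar li {
        display: inline-block;
        padding: 10px;
      }
      /* Highlight the item under the mouse. */
      #navbar li:hover {
        background-color: gray;
      }
      /* Links in the bar: white text, no underline. */
      #navbar a {
        color: white;
        text-decoration: none;
      }
    </style>
  </head>
  <body>
    <div id="navbar">
      <ul>
        <li><a href="https://en.wikipedia.org/wiki/HTML">HTML</a></li>
        <li><a href="https://en.wikipedia.org/wiki/CSS">CSS</a></li>
        <li><a href="https://en.wikipedia.org/wiki/JavaScript">JavaScript</a></li>
      </ul>
    </div>
    <p>Some text content for the page:</p>
    <ul>
      <li>First item</li>
      <li>Second item</li>
    </ul>
  </body>
</html>
```

Because every rule begins with #navbar, the second list in the body keeps its default bullets and spacing.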
First, read the HTML code within the <body>
tag pair, near the bottom of the document. First, there is a div
with the id “navbar”. This div contains an unordered list; each list item itself contains a link to a web page.
Following the navbar div, you’ll see a paragraph and another unordered list. They are not very interesting, and just provide some text content for the page.
The goal is to use CSS to style both the navbar div itself, and the HTML elements within the div (the list, the list items, and the links). In the navbar, we want:
Exercise: In the above example, delete or change CSS properties to experiment with spacing between elements (using padding), colors, and whether or not links are underlined.
CSS is a simple language. You have just learned the fundamental structure. However, there are a bewildering number of properties that control style. If you’d like to go further with how to style web pages, the recommended text Learn to Code HTML & CSS will be helpful.
A great way to learn to write web pages is to view the source of a favorite webpage. On Chrome, you can go to the View: Developer: View Source menu item to see the HTML code for a page you are looking at.
Some web pages are crafted by hand. Some are generated automatically by a machine. Some are simple. Some are complicated. So if the HTML you find is poorly formatted, or beyond what you can understand, look for another web page. You are welcome to look at the code for this page now. We have nothing to hide, but since these pages are partially formatted by automatic translation from yet another language (Markdown), we do not hold them up as an example of pretty HTML code.