Web pages - the basics

The web has come a long way, but isn't it amazing that it still works after all these years?

Web pages are just text instructions that computers provide to each other. When someone wants to see a web page, they instruct their web browser to get it for them. The browser looks up the website's IP address through DNS and then sends a stream of text to that address, which is a request for the page.

The web server software sends back a stream of text which describes the page to the browser.

Note that there are no pictures or videos involved thus far. This is all just text to and from a browser and a server.
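
To make that exchange concrete, here is a minimal Python sketch of what the two streams of text can look like. It assumes the HTTP/1.1 format that browsers and servers commonly use. The host name www.example.com, the sample response, and the helper names build_request and split_response are made up for illustration, and nothing is actually sent over a network.

```python
def build_request(host: str, path: str = "/") -> str:
    """Return the text a browser might send to ask a server for a page."""
    lines = [
        f"GET {path} HTTP/1.1",   # what we want and which protocol version
        f"Host: {host}",          # which website on that server we mean
        "Connection: close",      # ask the server to hang up when done
        "",                       # a blank line marks the end of the request
        "",
    ]
    return "\r\n".join(lines)


def split_response(raw: str):
    """Split a server's reply into status line, headers, and page text."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# A hypothetical reply, written out by hand for the example.
SAMPLE_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><head><title>Hi.</title></head><body>This is my page</body></html>"
)

request_text = build_request("www.example.com", "/index.html")
status, headers, body = split_response(SAMPLE_RESPONSE)

print(request_text)
print(status)
print(headers["content-type"])
print(body)
```

Everything printed here is ordinary text, which is the point: the request, the status line, the headers, and the page itself could all be read in a plain text editor.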

The browser then downloads any additional resources that the text indicates will be needed for the page. Sometimes that's a video file; often it's just a picture file. These other files, and potentially other pages that are downloaded to complete the page, don't even need to come from the same web server that provided the original page.
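
As a rough illustration of how a browser might work out which extra files a page needs, the Python sketch below scans a page's text for src attributes. The sample page, the addresses, and the ResourceFinder class are hypothetical, and a real browser looks at far more than these three tags.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical page address and page text, invented for this example.
PAGE_URL = "https://www.example.com/pages/index.html"
PAGE_TEXT = """<html><body>
<img src="photo.jpg">
<video src="https://media.example.net/clip.mp4"></video>
</body></html>"""


class ResourceFinder(HTMLParser):
    """Collect the addresses of extra files that a page's text refers to."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        # Only a few tags are checked here to keep the sketch short.
        if tag in ("img", "video", "script"):
            src = dict(attrs).get("src")
            if src:
                # Turn a relative name like "photo.jpg" into a full address.
                self.resources.append(urljoin(self.page_url, src))


finder = ResourceFinder(PAGE_URL)
finder.feed(PAGE_TEXT)
finder.close()

for address in finder.resources:
    print(address)
```

Note that the picture resolves to the same server as the page, while the video points at a different server entirely.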

As the browser downloads the page, it examines the text and organizes the information to show to the user.

You can create a web page with very little effort. For example, a Windows user might put this text into Notepad:

<html><head><title>Hi.</title></head><body>This is my page</body></html>

Once that text is saved as a file with the extension ".htm" or ".html", it's ready to be put on a web server or even just opened directly in a web browser.
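
If you'd rather not use Notepad, the same file can be written by a short program. The Python sketch below saves the example page into a temporary folder, then reads it back and picks out the title and body, which is a very simplified version of what a browser does when it examines the text. The file name mypage.html and the TitleAndBody class are assumptions made for the example.

```python
import tempfile
from html.parser import HTMLParser
from pathlib import Path

PAGE = "<html><head><title>Hi.</title></head><body>This is my page</body></html>"


class TitleAndBody(HTMLParser):
    """Remember the text found inside the <title> and <body> tags."""

    def __init__(self):
        super().__init__()
        self.current = None  # which of the two tags we are inside, if any
        self.title = ""
        self.body = ""

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "body"):
            self.current = tag

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current == "title":
            self.title += data
        elif self.current == "body":
            self.body += data


with tempfile.TemporaryDirectory() as folder:
    # Save the page as a file, much like saving it from a text editor.
    page_file = Path(folder) / "mypage.html"
    page_file.write_text(PAGE, encoding="utf-8")

    # Read the file back and examine the text.
    reader = TitleAndBody()
    reader.feed(page_file.read_text(encoding="utf-8"))
    reader.close()

print("Title:", reader.title)
print("Body:", reader.body)
```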

Of course, most web pages are much more complex than that example. You can right-click on most web pages and choose "View Source" to see the text that is actually creating the page you're looking at.

Cached files:

The files that aren't part of the text have to be copied to your computer in order for you to see them. Your computer doesn't usually keep them for very long, but it has to keep them as long as the page is open, and it usually keeps them for a while after that so you don't need to download them again if you return to that page or visit another that uses the same files. This short-term storage is referred to as caching, and your web experience is much better for it.

However, sometimes the stored files no longer match what the server is providing, which means you're no longer getting the pages that you asked for. That's why you are sometimes told to refresh a page or to clear your cache. In most browsers on Windows, holding down Ctrl and pressing F5, or holding down Ctrl and Shift and pressing R, will cause the browser to ignore the cached files and load new ones. The exact keys vary by browser and operating system, so if you don't know which combination applies, you can always go through the menus to accomplish the same thing.
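
The idea behind caching, including how a stale copy can show up and how a forced refresh fixes it, can be sketched in a few lines of Python. This is a toy model only: the TinyCache class, the five-minute lifetime, and the logo address are invented for the example, and real browsers follow more detailed rules that the server helps set.

```python
import time


class TinyCache:
    """Keep downloaded files for a limited time so they can be reused."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds
        self.clock = clock
        self.entries = {}  # address -> (time stored, file contents)

    def get(self, url, download, force_refresh=False):
        entry = self.entries.get(url)
        if entry and not force_refresh:
            stored_at, contents = entry
            if self.clock() - stored_at < self.max_age_seconds:
                return contents  # still fresh enough: no download needed
        contents = download(url)
        self.entries[url] = (self.clock(), contents)
        return contents


# A pretend server and a download function that records each real download.
LOGO = "https://www.example.com/logo.png"
server_files = {LOGO: b"old logo"}
downloads = []


def download(url):
    downloads.append(url)
    return server_files[url]


cache = TinyCache(max_age_seconds=300)
first = cache.get(LOGO, download)                       # downloaded
server_files[LOGO] = b"new logo"                        # the server changes the file
second = cache.get(LOGO, download)                      # stale copy from the cache
third = cache.get(LOGO, download, force_refresh=True)   # like a forced refresh

print(first, second, third, len(downloads))
```

The second request returns the old logo even though the server has a new one, which is the mismatch described above. The forced refresh skips the stored copy and downloads the file again.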

Cookies:

When your browser asks for a web page, it also sends any special text it has stored at that site's request. When a web page is sent to a browser, the response often contains a little extra text at the beginning that the site asks the browser to hold on to so that the site can provide a customized experience. These little strings of text also have expiration dates associated with them if they're intended to last beyond the current visit.

Each time the browser goes back to that page or a page on the same site, it sends that text along with its request. You can of course tell your browser not to send the text, or the text may expire, but having the browser remember information means the website can "remember" what you're doing.
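
Here is a small sketch of that round trip using Python's standard http.cookies module. The header lines, names, and values (theme, session, abc123) are made up for the example. One string carries a lifetime of a day, while the other has none and so would normally last only for the current visit.

```python
from http.cookies import SimpleCookie

# Extra text a server might put at the beginning of its response
# (hypothetical values, written by hand for this example).
response_headers = [
    "Content-Type: text/html",
    "Set-Cookie: theme=dark; Max-Age=86400",
    "Set-Cookie: session=abc123",
]

# The browser's side: remember whatever the server asked it to keep.
jar = SimpleCookie()
for header in response_headers:
    name, _, value = header.partition(": ")
    if name.lower() == "set-cookie":
        jar.load(value)

# On the next request to the same site, send the stored text back.
cookie_header = "Cookie: " + "; ".join(
    f"{morsel.key}={morsel.value}" for morsel in jar.values()
)

print(cookie_header)
print("theme lifetime in seconds:", jar["theme"]["max-age"])
print("session lifetime in seconds:", jar["session"]["max-age"] or "(none given)")
```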

These little text entries that are remembered (or not) and sent with the page requests (or not) are referred to as cookies. In themselves, they're quite useful and somewhat essential. The downside of cookies is that when websites embed pages from advertisers inside their own pages, the advertisers' servers can ask your browser to remember something for the advertiser. That means that if you go from one site to a completely different site and both use the same advertiser, the advertiser can connect your visits to both sites with each other.
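
The simulation below is a rough Python sketch of how that tracking could work. Everything in it is hypothetical: the AdServer and Browser classes, the site names, and the visitor numbering are invented to show the idea, and real advertising systems are considerably more involved.

```python
class AdServer:
    """A pretend advertiser that hands out a cookie and logs where it is seen."""

    def __init__(self):
        self.next_id = 1
        self.visits = {}  # cookie value -> sites where this visitor saw an ad

    def serve_ad(self, embedding_site, cookie=None):
        if cookie is None:
            # First time seeing this browser: ask it to remember a new ID.
            cookie = f"visitor-{self.next_id}"
            self.next_id += 1
        self.visits.setdefault(cookie, []).append(embedding_site)
        return cookie


class Browser:
    """A pretend browser that stores one cookie per server it talks to."""

    def __init__(self):
        self.cookies = {}  # server name -> cookie value

    def visit(self, site, ad_server, ad_domain="ads.example"):
        # The page on `site` embeds an ad, so the browser also contacts the
        # advertiser and sends along any cookie it holds for the advertiser.
        stored = self.cookies.get(ad_domain)
        self.cookies[ad_domain] = ad_server.serve_ad(site, stored)


ads = AdServer()
browser = Browser()
browser.visit("news.example", ads)
browser.visit("shop.example", ads)
print(ads.visits)

# Removing the cookie makes the advertiser treat the browser as a stranger.
browser.cookies.clear()
browser.visit("news.example", ads)
print(ads.visits)
```

The two sites never talk to each other. The link between the visits exists only because the browser sent the same cookie to the advertiser both times.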

Some people would rather advertisers didn't keep track of the sites they go to. Unfortunately, acting on that preference takes work. You can remove all cookies all the time, which means that sites won't recognize you, auto-fill your username, store your preferences, or do anything else custom to you. Or you'll have to find ways to remove cookies from the sites you don't want while leaving the ones you do want intact.