Introduction to the WWWUI 2023-2024

Word Wide Web

What is The World Wide Web?

The World Wide Web is an interconnected system that connects a set of web pages between the internet. It is important to not confuse the Internet with WWW. WWW is only one of the applications that was built above internet.

Who was the creator?

Tim Berners-Lee proposed the architecture of the WWW, creating the first Server and the first browser in 1990, and published in 1991

Components of the Word Wide Web

HTTP Protocol

HTTP Methods:

This protocol provides a series of methods to communicate with a server from a client. There are 6 methods with different uses and rules:

DELETE

Method in charge of delete a resource from the server. The resource could be a file, an entry on a database, or whatever. This method is not a default method, so should be implemented by the server.

GET

The get method is used to ask for a resource from a serve. This method can’t send data to the server. The answer is a text which could be in different formats, like HTML, JSON and so on.

To make the request more specific, you can send some information about what you are looking up. This values are append to the URL, as a key value, separate each one with the ampersand & symbol and after the close question mark of the URL.

In most cases, is enough to append the request values to the URL. But there are occasions on which it is not enough and you need to use the FORM ENCODING format

user_agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36"
curl -vvv -G http://localhost:8000/test -H "User-Agent: $user_agent" -d query=scope=1

 Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /test?query=scope=1 HTTP/1.1
> Host: localhost:8000
> Accept: */*
> User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36
> 
 HTTP 1.0, assume close after body
< HTTP/1.0 404 File not found
< Server: SimpleHTTP/0.6 Python/3.11.4
< Date: Mon, 14 Aug 2023 12:28:11 GMT
< Connection: close
< Content-Type: text/html;charset=utf-8
< Content-Length: 335

HEAD

This method asks for the same headers which will be returned with the get method. It is useful in some contexts as small computer capabilities. Imagine you haven’t RAM available. You, could ask first for the headers of the request, evaluate the content-length property and act according to this value.

OPTIONS

The method OPTIONS retrieves a list of methods availables for a resource. This method is implement by default in most of the servers

curl -X OPTIONS https://example.org -i

HTTP/2 200 
allow: OPTIONS, GET, HEAD, POST
cache-control: max-age=604800
content-type: text/html; charset=UTF-8
date: Wed, 16 Aug 2023 15:06:45 GMT
expires: Wed, 23 Aug 2023 15:06:45 GMT
server: EOS (vny/044F)
content-length: 0

POST

The POST method allows to send data to the server. Normally, this method, produces a change in the state of the server, for example, when a login process ends successfully, the server creates a session for the user and works in a different way. That means that the post method is not indepondent.

POST /test HTTP/1.1
Host: foo.example
Content-Type: application/x-www-form-urlencoded
Content-Length: 27

field1=value1&field2=value2

PUT

With this method, we can create or update an existing resource. It means that the method is indepondent.

PUT /new.html HTTP/1.1
Host: example.com
sontent-type: text/html
Content-length: 16

<p>New File</p>

HTTP CODES

When we ask for a resource using the HTTP protocol, the response has a status code to identify if the answer was successful or not. The codes are grouped in five categories.

1XX INFORMATION RESPONSE

When a request gets a code between 100 and 199, the value of the response is informative

2XX SUCCESS RESPONSE

In this case, the request ends successfully. The most common response is 200 which means OK. But there is a lot of entries.

3XX REDIRECT RESPONSE

With this response, the client should make another request following a new URL because the resource has been moved temporally or permanent

4XX CLIENT ERROR RESPONSE

When the client gets an error code between 400 and 499, the client performs the request with some errors that should be fixed before trying again. These errors could be from a badly formed URL to an incorrect data provided to the server, including trying to access some inexistent resource, the famous 404 Page not found

5XX SERVER ERROR

With the 5XX error, the server informs the client, that there was an error processing the request, but it could be temporally and the error is not caused by a client’s action.

Mozilla documentation status code

THE URL

The URL is an id for a resource on the WWW. The URL is formed by different parts:

The protocol

The protocol specify the communication schema to be used. The most known are:

HTTP:
HTTP:
FTP
FTPS
file: This is not a protocol by itself, it is an instruction for some programs which tell the software that the resource is on the computer.

The domain

The domain is the namespace of the resource. When the client writes a domain, there is a process in which the computer ask for an ip associated with the domain to a DNS Server. That means, that the client could use the domain or the ip elsewhere.

PORT

The server can open many ports as sockets. When the protocol is http, the default port is 80 or 443 if uses https.

The first 2¹⁰ numbers are ports used by the operative system. After that, the user could open and use the port which desired.

This is specially useful when working with live servers such as the one provided by Visual Studio.

The path

The path or location of the resource on the server. Normally, it represents a physical location for a file. But now, it could work as an abstraction to the server.

The query parameters

A list of key value pairs, joined by an ampersand, that follows the path of the URL and separate from it with a closed question marked.

Anchor

The anchor is a key inside the web-page, like a section. This allows the user to go directly to the section which is refereed. The character used to separate the anchor from the rest of the URL is a hashtag, because that, we can refer to it as a hash part.

http://example.org:80/index.html?user=pedro

Characteristics

The main characteristic of the WWW is the capability to connect sites with hyperlinks. This provides an architecture not hierarchy, allowing users to explore the content in a different way connecting different ideas, instead of following a path.

W3C

The World Wide Web Consortium (W3C) develops standards and guidelines to help everyone build a web based on the principles of accessibility, internationalization, privacy, and security.

Standards and Guidelines

The standards are rules developed to help web developers and browsers to render well a web page, helping to make it accessible, with semantic meaning. The most important advantage about this standard is that help to make the site visible on many devices, without many headaches.

HTML

HTML is a markup language (HyperText Markup Language). Is the most basic tool to build websites. With HTML we can define the structure of the page and give meaning to its part. Currently, the html is combined with CSS and JavaScript

Basic Syntax

The syntax of HTML has been evolving in recent years. Now the standard is HTML5 which proved more tag and functionalities.

<p>This is a paragraph</p>

The tags have an opening part which indicates the start of this part and a closed tag end indicates the end of the tag. The main difference between both is the slash character after the less than symbol.

The tag could be wrapped by another tag. One example of that is the use of ordered or unordered lists. The tags ol and ul create it respectively.

<p> Ingredients: </p>
    <ul>
      <li> salt</li>
      <li> 2 eggs </li>
      <li> ... </li>
    </ul>
<p>Steps:</p>
<ol>
  <li> Step 1</li>
  <li> Step 2</li>
  <li> Step 3</li>
<ol>

Attributes

To provide more meaning to an element, the developer could use attributes. They are a key, value pairs which give some additional information or functionalities to an element. There are a lot of predefined attributes like:

class
id
href
value

And so on. Also, the developer could use their own attributes, but is a good practice to prepend it with data- to ensure that doesn’t override the default behavior.

<img src="img/my_image.png" />
<ul data-type="quizz">
  <li data-correct="false"> Answer one</li>
  <li data-correct="false"> Answer two</li>
  <li data-correct="true"> Answer three</li>
</ul>

Semantic tags

As I mentioned, there is a meaning behind some tags. This helps the browser and search engine.

header

Inside the header should be the most important part of the page. With the title, logo and, description.

main

The main part of the site. In this element couldn’t be repeated elements from other pages, likes navigation links, footer or introduction

section

The section wraps related content, it works as a chapter of a book.

article

Inside the section should be at least one article that talks about a single topic formed by paragraphs

nav

The navigation bar. This element contains the links to the other page or section of the web site

aside

The aside tag defines indirectly related content surrounding the main text.

footer

The footer of a document. Typically contains:

authorship information
copyright information
contact information
site map
back to top link
related documents

references:

External links

WCSCHOOL This page has content about different technologies, includes html, css and javascript
mdn documentation Mozzila has an excellent documentation about any content related to the web
can i use? When you are not sure if something is supported in some browser