How anchors <a> work

I googled for a MageOverflow post on how anchors work, so that I wouldn't have to write it down myself. But I couldn't find one.

If you only want to know how an anchor (or a JS/CSS link) needs to be built to work, scroll down.

So I'll do it myself. Please send me links on Twitter or via mail if you find a better tutorial, so I can link it here.

Parts of a URI

As you can read in the parse_url docs, every URI consists of the following parts:

  • scheme
  • host
  • port
  • user
  • pass
  • path
  • query
  • fragment

Some of them can be omitted, but they still exist, even if they are empty.

                    hierarchical part
        ┌───────────────────┴─────────────────────┐
                    authority               path
        ┌───────────────┴───────────────┐┌───┴────┐
  abc://username:password@example.com:123/path/data?key=value#fragid1
  └┬┘   └───────┬───────┘ └────┬────┘ └┬┘           └───┬───┘ └──┬──┘
scheme  user information     host     port            query   fragment

  urn:example:mammal:monotreme:echidna
  └┬┘ └──────────────┬───────────────┘
scheme              path

Gratefully borrowed from Wikipedia
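To see these parts in code, here is a minimal sketch using Python's standard library module urllib.parse (the choice of Python is mine; parse_url does the same job in PHP). It splits the two example URIs from the diagram.

```python
from urllib.parse import urlsplit

# The example URI from the diagram above.
parts = urlsplit("abc://username:password@example.com:123/path/data?key=value#fragid1")

print(parts.scheme)    # abc
print(parts.username)  # username
print(parts.password)  # password
print(parts.hostname)  # example.com
print(parts.port)      # 123
print(parts.path)      # /path/data
print(parts.query)     # key=value
print(parts.fragment)  # fragid1

# A URN has no authority, so everything after the scheme is the path.
urn = urlsplit("urn:example:mammal:monotreme:echidna")
print(urn.scheme)        # urn
print(urn.path)          # example:mammal:monotreme:echidna
print(repr(urn.netloc))  # ''
```

Note that the parts which are not present still exist, they are just empty strings (or None for the port).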

Scheme

There are a lot of schemes, like file://, tcp://, tel:, mailto:.

Username and password

Usernames are used, for example, in ftp links. You can also use them in http links, to pass them to an .htaccess auth.

Hostname

The hostname is often a domain, like magento.com, but it can be an IP address too: 66.211.190.110.

Port

Many schemes define default ports:

  • http: 80
  • ftp: 21
  • ssh: 22

So you can omit the port.
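A small sketch of what "omitting the port" means in practice. The table only contains the three defaults listed above, and effective_port is a hypothetical helper name, not something from a library.

```python
from urllib.parse import urlsplit

# Default ports from the list above; extend as needed.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "ssh": 22}

def effective_port(url):
    """Return the explicit port, or the scheme's default if none is given."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)

print(effective_port("http://magento.com/"))       # 80
print(effective_port("http://magento.com:8080/"))  # 8080
print(effective_port("ftp://example.com/file"))    # 21
```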

Path

The parts before the path are needed to find the server (at least for http). User, password, host and port together form the so-called authority. Starting with the path, we are on our way through the server to the right place.

We need to distinguish between directories and files. Everything ending in a / is a directory, everything else is a file:

/directory/directory2/file
/file

Query

We can add additional information, so the server knows what we want.
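As a quick illustration, this is how a query string can be read on the receiving side. The URL and its parameter names are made up for the example.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical search URL with two query parameters.
url = "http://example.com/search?q=anchor&page=2"

query = urlsplit(url).query
params = parse_qs(query)

print(query)   # q=anchor&page=2
print(params)  # {'q': ['anchor'], 'page': ['2']}
```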

Fragment

And with the fragment we can link to a place inside whatever comes back.

What is the problem with all this?

Default values or omitting parts

In the following I'm only talking about http, because the problems that motivated me to write this are all http related.

  • scheme - the same as the page we are on (http(s))
  • host - the same we are on, e.g. magento.com
  • port - 80/443 (or the same we are on)
  • user - I think none, but I'm not sure
  • pass - I think none, but I'm not sure
  • path - "/"
  • query - none
  • fragment - none

The following examples assume we are on the page:

http://blog.fabian-blechschmidt.de/how-anchor-work

You can only omit from left to right. This means if you want to omit the port, you have to go without scheme and host too!

Omitting scheme

When we omit the scheme, the browser uses the same scheme as the page we are on.

If we link to //google.com/ we will use http, because we are currently on a page served via http. Please note the two slashes (//) at the start of the URL.

In particular this means we can include (third party) js and css files based on the current scheme!

Don't use

http://code.jquery.com/jquery-2.1.4.min.js

but instead

//code.jquery.com/jquery-2.1.4.min.js

(or even better
https://code.jquery.com/jquery-2.1.4.min.js)
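A minimal sketch of this behaviour, assuming Python's urllib.parse.urljoin resolves the link the way a browser would in this simple case. The https variant of the blog URL is only there for illustration.

```python
from urllib.parse import urljoin

script = "//code.jquery.com/jquery-2.1.4.min.js"

# On an http page the scheme-less link becomes http ...
on_http = urljoin("http://blog.fabian-blechschmidt.de/how-anchor-work", script)
# ... and on an https page it becomes https.
on_https = urljoin("https://blog.fabian-blechschmidt.de/how-anchor-work", script)

print(on_http)   # http://code.jquery.com/jquery-2.1.4.min.js
print(on_https)  # https://code.jquery.com/jquery-2.1.4.min.js
```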

Omitting host

We can omit the host, it would look like this:

/another-cool-blogpost

Then the scheme is assumed to be http and the host blog.fabian-blechschmidt.de. It is important to understand that the path of this URL is absolute. The browser takes the current scheme and host, adds the path and builds this:

http://blog.fabian-blechschmidt.de/another-cool-blogpost

So if we move the blog from blog.fabian-blechschmidt.de to, let's say, fabian-blechschmidt.de/blog, all links will break, because this is built:

http://fabian-blechschmidt.de/another-cool-blogpost

while the blog post now lives here:

http://fabian-blechschmidt.de/blog/another-cool-blogpost
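The same thing as a small sketch, again assuming urljoin from Python's standard library behaves like the browser here. The link is identical in both cases, only the page it sits on has moved.

```python
from urllib.parse import urljoin

link = "/another-cool-blogpost"

# Before the move: the link lands where the post lives.
before = urljoin("http://blog.fabian-blechschmidt.de/how-anchor-work", link)
# After the move to /blog: the leading slash throws away the /blog part.
after = urljoin("http://fabian-blechschmidt.de/blog/how-anchor-work", link)

print(before)  # http://blog.fabian-blechschmidt.de/another-cool-blogpost
print(after)   # http://fabian-blechschmidt.de/another-cool-blogpost
```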

Omitting the path

If you want to link relatively, you need to omit the first /, like so:

another-cool-blogpost

Here it is important to understand that the relative link is relative to the current directory we are in. If our blog post URLs are "directories", like this:

http://blog.fabian-blechschmidt.de/how-anchor-work/

we have a problem, because the created link appends the new path to the current one:

http://blog.fabian-blechschmidt.de/how-anchor-work/another-cool-blogpost

and this URL is most likely wrong and broken.
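A short sketch of the difference the trailing slash makes, assuming urljoin resolves relative paths like a browser does:

```python
from urllib.parse import urljoin

link = "another-cool-blogpost"

# Current page is a "file": the last path segment is replaced.
as_file = urljoin("http://blog.fabian-blechschmidt.de/how-anchor-work", link)
# Current page is a "directory": the link is appended to it.
as_dir = urljoin("http://blog.fabian-blechschmidt.de/how-anchor-work/", link)

print(as_file)  # http://blog.fabian-blechschmidt.de/another-cool-blogpost
print(as_dir)   # http://blog.fabian-blechschmidt.de/how-anchor-work/another-cool-blogpost
```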

As far as I know there is no solution for this problem, except knowing your base URL and building your links upon this base URL.
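One way to build links upon a base URL could look like the following sketch. BASE_URL and blog_url are hypothetical names; the idea is only that the base URL is configured in exactly one place and every link is generated from it.

```python
from urllib.parse import urljoin

# Hypothetical: the one place where the blog's base URL is configured.
# The trailing slash matters, it marks /blog/ as a directory.
BASE_URL = "http://fabian-blechschmidt.de/blog/"

def blog_url(path):
    """Build a full URL from a path relative to BASE_URL."""
    # Strip a leading slash so the path is appended to the base
    # instead of replacing its /blog/ part.
    return urljoin(BASE_URL, path.lstrip("/"))

print(blog_url("another-cool-blogpost"))
# http://fabian-blechschmidt.de/blog/another-cool-blogpost
print(blog_url("/css/styles.css"))
# http://fabian-blechschmidt.de/blog/css/styles.css
```

If the blog moves again, only BASE_URL has to change.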

We get the same problem with css and js files! If we make them absolute:

/css/styles.css

we can't easily move them into a subdirectory. If we make them relative:

js/script.js

they will break on "deeper" pages, like:

nerd-shirts/people/always-be-marius

Because the script doesn't live in

nerd-shirts/people/js/script.js
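To make the asset problem concrete, here is a sketch that resolves the same two references from a top-level page and from a deeper page. The host example.com is an assumption, the paths follow the examples above.

```python
from urllib.parse import urljoin

# Hypothetical shop host; the page paths follow the examples above.
top_page = "http://example.com/nerd-shirts"
deep_page = "http://example.com/nerd-shirts/people/always-be-marius"

# The relative reference depends on the page it is used on.
rel_top = urljoin(top_page, "js/script.js")
rel_deep = urljoin(deep_page, "js/script.js")
# The absolute path is the same everywhere, but it is tied to the server root.
abs_deep = urljoin(deep_page, "/css/styles.css")

print(rel_top)   # http://example.com/js/script.js
print(rel_deep)  # http://example.com/nerd-shirts/people/js/script.js
print(abs_deep)  # http://example.com/css/styles.css
```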

Omitting the other parts

Port

You can't give only a port and omit the host in front of it (I think). It would look like this:

:123/my-path/my-file    

But this is treated like a relative path (see omitting the path).
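A quick check with Python's urllib.parse, assuming browsers resolve the string the same way: the ":123" is not recognised as a port, it simply becomes the first segment of a relative path.

```python
from urllib.parse import urljoin, urlsplit

link = ":123/my-path/my-file"

# Parsed on its own there is no host and no port, only a path.
alone = urlsplit(link)
print(alone.port)  # None
print(alone.path)  # :123/my-path/my-file

# Resolved against the current page it is appended like any relative path.
resolved = urljoin("http://blog.fabian-blechschmidt.de/how-anchor-work", link)
print(resolved)  # http://blog.fabian-blechschmidt.de/:123/my-path/my-file
```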

User and password

User and password are "removed" by the browser from the current URL, so the browser no longer "knows" these details. If you want to use them, put them into the link explicitly.

Queries

Queries, like fragments, are part of a single request. Unlike user and password they are not removed from the URL, but they are still not used in any other URL on the page.
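A last sketch, with a made-up page URL: a link to another page does not inherit the query or the fragment of the page it sits on.

```python
from urllib.parse import urljoin

# Hypothetical current page with a query and a fragment.
current = "http://example.com/page?key=value#fragid1"

# Neither the query nor the fragment shows up in the resolved link.
other = urljoin(current, "other-page")

print(other)  # http://example.com/other-page
```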

Hope this helps. If I have forgotten anything, please let me know: @Fabian_ikono