How to create Hard-links and Soft-links in Linux


You're probably familiar with shortcuts in Microsoft Windows or aliases on the Mac. Linux has two comparable mechanisms: hard links and symbolic links.

Symbolic links (also called symlinks or softlinks) most resemble Windows shortcuts. They contain a pathname to a target file.

Hard links are a bit different: a hard link is a directory entry that maps a filename directly to the file's inode, so it refers to the file itself rather than to a pathname.

Linux files don't actually live in directories. They are assigned an inode number, which Linux uses to locate files.

So a file can have multiple hardlinks, appearing in multiple directories, but isn't deleted until there are no remaining hardlinks to it.

Unix/Linux files consist of two parts: the data part and the filename part.

The data part is associated with something called an 'inode'. The inode records where the data is stored, along with metadata such as the file permissions, ownership, and timestamps.

The filename part carries a name and an associated inode number.

More than one filename can reference the same inode number; such filenames are said to be "hard linked" together.
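
To make the name-to-inode pairing concrete, here is a minimal Python sketch that groups the names in a directory by the inode each one refers to. The helper names_by_inode and the file names (report.txt, report-again.txt, other.txt) are made up for illustration; os.link is the standard-library counterpart of a plain ln.

```python
import os
import tempfile
from collections import defaultdict


def names_by_inode(directory):
    """Group the filenames in `directory` by the inode number they refer to."""
    groups = defaultdict(list)
    with os.scandir(directory) as entries:
        for entry in entries:
            # follow_symlinks=False: report the entry itself, not a link target
            inode = entry.stat(follow_symlinks=False).st_ino
            groups[inode].append(entry.name)
    return {inode: sorted(names) for inode, names in groups.items()}


with tempfile.TemporaryDirectory() as workdir:
    first = os.path.join(workdir, "report.txt")
    with open(first, "w") as f:
        f.write("some data\n")

    # A second filename for the same inode (a hard link)
    os.link(first, os.path.join(workdir, "report-again.txt"))

    # An unrelated file gets an inode of its own
    with open(os.path.join(workdir, "other.txt"), "w") as f:
        f.write("unrelated\n")

    groups = names_by_inode(workdir)

for inode, names in groups.items():
    print(inode, names)
```

The inode numbers printed will differ from system to system; what matters is that the two hard-linked names end up under one inode.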

Soft-links

On the other hand, there's a special file type whose data part carries a path to another file. Since it is a special file, the OS recognizes the data as a path and redirects opens, reads, and writes to the file that the path names, rather than to the data within the special file itself.

This special file is called a 'soft link' or a symbolic link (aka a symlink).
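
As a rough illustration of that redirection, the Python sketch below creates a symlink, reads back the path stored inside it, and then reads and writes through it. The names target.txt and link.txt are hypothetical; os.symlink and os.readlink are the standard-library wrappers around the symlink and readlink system calls.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "target.txt")
    link = os.path.join(workdir, "link.txt")
    with open(target, "w") as f:
        f.write("hello from the target\n")

    # Create the special file; its data is the path given here
    os.symlink(target, link)

    # readlink returns the link's own data: a pathname, not file contents
    stored_path = os.readlink(link)

    # A normal open() is redirected to the file the path names
    with open(link) as f:
        seen_through_link = f.read()

    # Writes are redirected in the same way
    with open(link, "a") as f:
        f.write("appended via the link\n")
    with open(target) as f:
        target_after_write = f.read()

print("stored path:", stored_path)
print("read via link:", seen_through_link, end="")
print("target now holds:", target_after_write, end="")
```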

Now, the filename part of the file is stored in a special file of its own along with the filename parts of other files; this special file is called a directory. The directory, as a file, is just an array of filename parts of other files.

When a directory is created, it is initially populated with the filename parts of two special files: the '.' and '..' files. The filename part for the '.' file is populated with the inode number of the directory file in which the entry has been made; '.' is a hardlink to the file that implements the current directory.

The filename part for the '..' file is populated with the inode number of the directory file that contains the filename part of the current directory file; '..' is a hardlink to the file that implements the immediate parent of the current directory.
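
One way to check this is to compare inode numbers. The sketch below is a small Python example under the assumption of an ordinary local filesystem; the directory name child is made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as parent:
    child = os.path.join(parent, "child")
    os.mkdir(child)  # the OS fills in the '.' and '..' entries

    child_ino = os.stat(child).st_ino
    parent_ino = os.stat(parent).st_ino

    # '.' inside child names the child directory itself
    dot_ino = os.stat(os.path.join(child, ".")).st_ino
    # '..' inside child names the directory that contains child
    dotdot_ino = os.stat(os.path.join(child, "..")).st_ino

print("child :", child_ino, "  child/. :", dot_ino)
print("parent:", parent_ino, "  child/..:", dotdot_ino)
```

Each pair printed on a line should show the same inode number.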

The ln command knows how to build hardlinks and softlinks; the mkdir command knows how to build directories (the OS takes care of the above hardlinks).

There are restrictions on what can be hardlinked (both names must reside on the same filesystem, the file being linked to must exist, etc.) that do not apply to softlinks (the link and its target can be on separate filesystems, the target does not have to exist, etc.).

On the other hand, softlinks carry costs that hardlinks do not (additional I/O to complete each file access, additional storage taken up by the softlink file's own data, etc.).

In other words, there are trade-offs with each.
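
Some of these differences can be observed directly. The following Python sketch, using the hypothetical names does-not-exist, hard, and soft, shows that a hardlink to a missing file is refused while a softlink to it is accepted, and that the softlink has data of its own. The same-filesystem rule is not shown, since demonstrating it needs two mounted filesystems.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    missing = os.path.join(workdir, "does-not-exist")

    # A hardlink needs an existing inode to refer to, so this is refused
    try:
        os.link(missing, os.path.join(workdir, "hard"))
        hardlink_error = None
    except FileNotFoundError as exc:
        hardlink_error = exc

    # A softlink only stores a path, so the target does not have to exist
    soft = os.path.join(workdir, "soft")
    os.symlink(missing, soft)
    is_link = os.path.islink(soft)    # the link itself exists
    resolves = os.path.exists(soft)   # following it leads nowhere

    # The stored path is the softlink's own data and takes up space;
    # on Linux its size is normally the length of that path in bytes
    link_size = os.lstat(soft).st_size
    path_length = len(os.fsencode(missing))

print("hardlink to missing file:", type(hardlink_error).__name__)
print("softlink created:", is_link, " resolves:", resolves)
print("softlink size:", link_size, " stored path length:", path_length)
```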

ln in action

Let's start off with an empty directory and create a file in it, named basic.file.

Now, let's use ln to make a hardlink, hardlink.file, to the file just created (basic.file).
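
For readers who want to reproduce these two steps, here is a minimal Python sketch of them, run in a temporary directory. The file contents are a placeholder, and os.link stands in for a plain ln; the inode numbers you get will not match the ones quoted in this article.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    basic = os.path.join(workdir, "basic.file")
    hardlink = os.path.join(workdir, "hardlink.file")

    # Step 1: create the file (placeholder contents)
    with open(basic, "w") as f:
        f.write("This is a file\n")

    # Step 2: give the same inode a second name
    os.link(basic, hardlink)

    basic_stat = os.stat(basic)
    hardlink_stat = os.stat(hardlink)
    with open(hardlink) as f:
        data_via_hardlink = f.read()

print("inodes:", basic_stat.st_ino, hardlink_stat.st_ino)
print("link count:", basic_stat.st_nlink)
print("data via hardlink.file:", data_via_hardlink, end="")
```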

We see that:

  • hardlink.file shares the same inode (73478) as basic.file
  • hardlink.file shares the same data as basic.file
  • If we change the permissions on basic.file, the same permissions change on hardlink.file.

The two files (basic.file and hardlink.file) share the same inode and data, but have different file names.
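
The permission behaviour can be checked with a short Python sketch. The mode 0o600 is an arbitrary choice for illustration, and os.chmod plays the role of the chmod command.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    basic = os.path.join(workdir, "basic.file")
    hardlink = os.path.join(workdir, "hardlink.file")
    with open(basic, "w") as f:
        f.write("This is a file\n")
    os.link(basic, hardlink)

    # Change the permissions through one name only
    os.chmod(basic, 0o600)

    # Permissions live in the inode, so both names report the same mode
    mode_via_basic = stat.filemode(os.stat(basic).st_mode)
    mode_via_hardlink = stat.filemode(os.stat(hardlink).st_mode)

print("basic.file   :", mode_via_basic)
print("hardlink.file:", mode_via_hardlink)
```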

Let's now make a softlink, softlink.file, to the original file.

Here, we see that although softlink.file accesses the same data as basic.file and hardlink.file, it does not share the same inode (73479 vs. 73478), nor does it exhibit the same file permissions. It also shows a new file type indicator at the start of the permissions string: the 'l' (softlink) flag.

The link has different permissions than the original file because it is a separate file of its own. Its real content is just a string naming the path of the original file.
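
The difference between the link and what it points to shows up when comparing os.lstat (which describes the link itself) with os.stat (which follows it). The sketch below is a Python approximation of this step, with os.symlink standing in for ln -s; the exact permission bits reported for the link can vary by platform.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    basic = os.path.join(workdir, "basic.file")
    softlink = os.path.join(workdir, "softlink.file")
    with open(basic, "w") as f:
        f.write("This is a file\n")
    os.chmod(basic, 0o600)

    # The relative target is resolved from the link's own directory
    os.symlink("basic.file", softlink)

    basic_info = os.stat(basic)
    link_info = os.lstat(softlink)     # the softlink file itself
    followed_info = os.stat(softlink)  # the file the softlink leads to

    link_mode = stat.filemode(link_info.st_mode)
    basic_mode = stat.filemode(basic_info.st_mode)
    link_has_own_inode = link_info.st_ino != basic_info.st_ino
    leads_to_basic = followed_info.st_ino == basic_info.st_ino

print("basic.file   :", basic_info.st_ino, basic_mode)
print("softlink.file:", link_info.st_ino, link_mode)
print("following the link reaches basic.file:", leads_to_basic)
```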

If we delete basic.file, we lose the ability to access the linked data through the softlink, because the path it stores no longer names anything. However, we still have access to the original data through the hardlink.

You will notice that when we deleted the original file, the hardlink didn't vanish. Similarly, if we had deleted the softlink, the original file wouldn't have vanished.
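
Here is a small Python sketch of the same experiment, assuming the three files from the walkthrough and placeholder contents. os.remove plays the role of rm.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    basic = os.path.join(workdir, "basic.file")
    hardlink = os.path.join(workdir, "hardlink.file")
    softlink = os.path.join(workdir, "softlink.file")
    with open(basic, "w") as f:
        f.write("This is a file\n")
    os.link(basic, hardlink)
    os.symlink("basic.file", softlink)

    # Delete the original name
    os.remove(basic)

    # The softlink still exists, but the path it stores is now dangling
    try:
        with open(softlink) as f:
            via_softlink = f.read()
    except FileNotFoundError:
        via_softlink = None

    # The hardlink still names the inode, so the data is intact
    with open(hardlink) as f:
        via_hardlink = f.read()
    links_left = os.stat(hardlink).st_nlink

print("via softlink.file:", via_softlink)
print("via hardlink.file:", via_hardlink, end="")
print("remaining link count:", links_left)
```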

Deleting files

When deleting files, the data part isn't disposed of until all the filename parts (or hardlinks) have been deleted. There's a count in the inode that indicates how many filenames point to this file, and that count is decremented by 1 each time one of those filenames is deleted. When the count reaches zero and no process still has the file open, the inode and its associated data are released.

By the way, the kernel also keeps track of how many times the file has been opened without being closed (in other words, how many references to the file are still active), and the data is not released while any such reference remains. This has some ramifications which aren't obvious at first: you can delete a file so that no "filename" part points to the inode, without releasing the space for the data part of the file, because the file is still open.
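
This last point can be observed with a short Python sketch, assuming a Linux system and the hypothetical file name scratch.file. The file is removed while a handle to it is still open, and the data remains readable through that handle.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "scratch.file")
    with open(path, "w") as f:
        f.write("still here\n")

    with open(path) as handle:
        # Remove the only filename while the file is still open
        os.remove(path)
        name_exists = os.path.exists(path)

        # Link count as seen through the open handle
        # (typically 0 on local Linux filesystems)
        links_while_open = os.fstat(handle.fileno()).st_nlink

        # The data has not been released yet
        data_after_unlink = handle.read()
    # Once the handle is closed, the space can be reclaimed

print("name still exists:", name_exists)
print("link count while open:", links_while_open)
print("data read after delete:", data_after_unlink, end="")
```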