tar (computing)

The archive data sets created by tar contain various file system parameters, such as name, timestamps, ownership, file-access permissions, and directory organization.

Today, Unix-like operating systems usually include tools to support tar files, as well as utilities commonly used to compress them, such as xz, gzip, and bzip2.

(The origin of tar's record size appears to be the 512-byte disk sectors used in the Version 7 Unix file system.)
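The 512-byte granularity can be observed directly. The following is a minimal sketch, assuming Python's standard `tarfile` module as the archiver; the helper name `build_archive` and the member name `hello.txt` are illustrative only.

```python
import io
import tarfile

BLOCK_SIZE = 512  # size of one tar block, as described in the text


def build_archive(payload: bytes) -> bytes:
    """Return an uncompressed tar archive holding one member, 'hello.txt'."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


archive_bytes = build_archive(b"hi")
# Even a two-byte payload yields an archive made of whole 512-byte blocks.
print(len(archive_bytes) % BLOCK_SIZE)  # 0
```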

To ensure portability across different architectures with different byte orderings, the information in the header record is encoded in ASCII.
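As an illustration of the ASCII encoding, the sketch below reads three fields from the first header block of an archive. It assumes the conventional field offsets (name at bytes 0 to 99, mode at 100 to 107, size at 124 to 135) and uses Python's `tarfile` module only to produce sample input; the helper names are hypothetical.

```python
import io
import tarfile


def parse_octal_field(field: bytes) -> int:
    """Decode a NUL- or space-terminated ASCII octal number."""
    text = field.rstrip(b"\0 ").decode("ascii")
    return int(text, 8) if text else 0


def read_first_header(archive_bytes: bytes) -> dict:
    """Return name, mode, and size from the first 512-byte header block."""
    header = archive_bytes[:512]
    return {
        "name": header[0:100].split(b"\0", 1)[0].decode("ascii"),
        "mode": parse_octal_field(header[100:108]),
        "size": parse_octal_field(header[124:136]),
    }


# Build a small sample archive in memory (ustar format, one member).
sample = io.BytesIO()
with tarfile.open(fileobj=sample, mode="w", format=tarfile.USTAR_FORMAT) as archive:
    member = tarfile.TarInfo(name="hello.txt")
    member.mode = 0o644
    member.size = 2
    archive.addfile(member, io.BytesIO(b"hi"))

fields = read_first_header(sample.getvalue())
print(fields["name"], oct(fields["mode"]), fields["size"])
```

Because every number is stored as text, the same bytes decode identically on big-endian and little-endian machines.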

To overcome this limitation, in 2001 star introduced a base-256 coding that is indicated by setting the high-order bit of the leftmost byte of a numeric field.
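The decoding rule can be sketched as follows. This is a simplified illustration, not the code of any particular tar implementation: it handles only non-negative values, and the function name `decode_numeric_field` is hypothetical. The sample assumes a 12-byte size field, which in its octal form holds at most 11 digits.

```python
def decode_numeric_field(field: bytes) -> int:
    """Decode a header number stored either as ASCII octal or as base-256."""
    if field[0] & 0x80:
        # High-order bit of the leftmost byte set: base-256 coding.
        # Clear the marker bit and read the rest as a big-endian integer.
        return int.from_bytes(bytes([field[0] & 0x7F]) + field[1:], "big")
    text = field.rstrip(b"\0 ").decode("ascii")
    return int(text, 8) if text else 0


octal_field = b"00000000012\0"                           # octal 12, i.e. 10
base256_field = b"\x80" + (2 ** 33).to_bytes(11, "big")  # too large for 11 octal digits

print(decode_numeric_field(octal_field))    # 10
print(decode_numeric_field(base256_field))  # 8589934592
```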

An exception is older versions of GNU tar running on the MASSCOMP RTU (Real Time Unix) operating system, which supported an O_CTG flag to the open() function to request a contiguous file; that support was removed in GNU tar version 1.24.

The following tags are defined by the POSIX standard:

In 2001, the star program became the first tar to support the new format.

This process copies an entire source directory tree including all special files, for example:

The tar format continues to be used extensively for open-source software distribution.

Unix-like distributions use it in various source- and binary-package distribution mechanisms, with most software source code made available in compressed tar archives.

The original tar format was created in the early days of Unix, and despite current widespread use, many of its design features are considered dated.

Because of the size of its name field, the original tar format could not store file paths and names longer than 100 characters.
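The effect of a fixed-width name field can be demonstrated with Python's `tarfile` module. Note the assumption: the module does not write the original format, so this sketch uses the later ustar format, which keeps the 100-byte name field and adds a separate prefix field for leading directories. A single path component longer than 100 bytes therefore still cannot be stored. The helper name `can_store_in_ustar` is hypothetical.

```python
import io
import tarfile


def can_store_in_ustar(name: str) -> bool:
    """Report whether a member name fits in a ustar header."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.USTAR_FORMAT) as archive:
        try:
            # An empty member is enough; only the header is of interest here.
            archive.addfile(tarfile.TarInfo(name=name))
        except ValueError:
            # tarfile raises ValueError when the name cannot be encoded.
            return False
    return True


print(can_store_in_ustar("a" * 100))  # True
print(can_store_in_ustar("a" * 101))  # False
```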

[25] Many older tar implementations neither record nor restore extended attributes (xattrs) or access-control lists (ACLs).

[26] More recent versions of GNU tar support Linux extended attributes, reimplementing star extensions.

[28] It is at best an inconvenience to the user, who is obliged to identify and delete a number of files interspersed with the directory's other contents.

A related problem is the use of absolute paths or parent directory references when creating tar files.

However, modern versions of FreeBSD and GNU tar do not create or extract absolute paths and parent-directory references by default, unless it is explicitly allowed with the flag -P or the option --absolute-names.
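A check of this kind can be sketched in a few lines. The function below is an illustrative assumption about how such a test might look, not the logic used by GNU tar or FreeBSD tar; it flags member names that are absolute or that contain a parent-directory component.

```python
def is_unsafe_member_name(name: str) -> bool:
    """Return True for absolute paths or paths with a '..' component."""
    if name.startswith("/"):
        return True
    return ".." in name.split("/")


candidates = ["docs/readme.txt", "/etc/passwd", "../outside.txt", "a/../../b", "a/b..c"]
flagged = [name for name in candidates if is_unsafe_member_name(name)]
print(flagged)  # ['/etc/passwd', '../outside.txt', 'a/../../b']
```

A name such as `a/b..c` is not flagged, because the two dots are part of a file name rather than a separate path component.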

The bsdtar program, which is available on many operating systems and is the default tar utility on Mac OS X v10.6, likewise does not follow parent-directory references or symbolic links.

If any are problematic, the user can create a new empty directory and extract the archive into it, or avoid the tar file entirely.

GNU Emacs is also able to open a tar archive and display its contents in a dired buffer.

Because it was designed for streaming to tape backup devices, the tar format has no centralized index or table of contents for files and their properties.

In turn, this design makes tar archives resilient against damage from missing portions, whether they are stored as digital files or on physical tape.
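The absence of an index means that listing an archive requires walking it header by header. The sketch below does this by hand under simplifying assumptions: 512-byte blocks, the size field at bytes 124 to 135 in ASCII octal, names that fit in the 100-byte name field, and no extended headers. The function name `list_members_sequentially` is hypothetical, and Python's `tarfile` module is used only to produce sample input.

```python
import io
import tarfile

BLOCK = 512


def list_members_sequentially(archive_bytes: bytes) -> list:
    """Collect member names by reading each header and skipping its data."""
    names = []
    offset = 0
    while offset + BLOCK <= len(archive_bytes):
        header = archive_bytes[offset:offset + BLOCK]
        if header == b"\0" * BLOCK:  # an all-zero block marks the end
            break
        names.append(header[0:100].split(b"\0", 1)[0].decode("ascii"))
        size_text = header[124:136].rstrip(b"\0 ").decode("ascii")
        size = int(size_text, 8) if size_text else 0
        # Member data is padded to a whole number of blocks.
        data_blocks = (size + BLOCK - 1) // BLOCK
        offset += BLOCK * (1 + data_blocks)
    return names


sample = io.BytesIO()
with tarfile.open(fileobj=sample, mode="w", format=tarfile.USTAR_FORMAT) as archive:
    for name, payload in (("a.txt", b"abc"), ("b.txt", b"x" * 600)):
        member = tarfile.TarInfo(name=name)
        member.size = len(payload)
        archive.addfile(member, io.BytesIO(payload))

member_names = list_members_sequentially(sample.getvalue())
print(member_names)  # ['a.txt', 'b.txt']
```

Reaching the second member requires reading the first header to learn how far to skip, which is why random access is slow but a reader can resume at any intact header.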

[31] A number of "indexed" compressors, which are aware of the tar format, can restore this feature for compressed files.

[33] Another issue with the tar format is that it allows several (possibly different) files in an archive to have identical paths and filenames.
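The following sketch reproduces the situation with Python's `tarfile` module, chosen here only as a convenient archiver. It stores two different payloads under the same hypothetical name `a.txt` and then looks the name up; `tarfile` resolves the ambiguity by returning the last occurrence, which is one possible policy rather than a rule of the format.

```python
import io
import tarfile


def build_archive_with_duplicates() -> bytes:
    """Create an archive in which two members share the name 'a.txt'."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for payload in (b"first", b"second"):
            member = tarfile.TarInfo(name="a.txt")
            member.size = len(payload)
            archive.addfile(member, io.BytesIO(payload))
    return buffer.getvalue()


with tarfile.open(fileobj=io.BytesIO(build_archive_with_duplicates()), mode="r") as archive:
    names = archive.getnames()
    latest = archive.extractfile(archive.getmember("a.txt")).read()

print(names)   # ['a.txt', 'a.txt']
print(latest)  # b'second'
```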

[39] The decompression of these formats is handled automatically if supported filename extensions are used, and compression is handled automatically using the same filename extensions if the option --auto-compress (short form -a) is passed to an applicable version of GNU tar.

MS-DOS's 8.3 filename limitations resulted in additional conventions for naming compressed tar archives.

Tar archiving is often used together with a compression method, such as gzip, to create a compressed archive. The files are first combined into a single archive, which is then compressed as one unit.
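A minimal sketch of this combination, assuming Python's `tarfile` module: the mode string `"w:gz"` writes the archive through a gzip stream, and `"r:*"` asks the module to detect the compression when reading. The helper name and sample file names are illustrative.

```python
import io
import tarfile


def build_compressed_archive(files: dict) -> bytes:
    """Pack the given name-to-bytes mapping into a gzip-compressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, payload in files.items():
            member = tarfile.TarInfo(name=name)
            member.size = len(payload)
            archive.addfile(member, io.BytesIO(payload))
    return buffer.getvalue()


files = {"a.txt": b"x" * 1000, "b.txt": b"x" * 1000}
compressed = build_compressed_archive(files)

# The result is a single gzip stream wrapping the whole archive.
print(compressed[:2] == b"\x1f\x8b")  # True (gzip magic number)

with tarfile.open(fileobj=io.BytesIO(compressed), mode="r:*") as archive:
    restored = {m.name: archive.extractfile(m).read() for m in archive.getmembers()}

print(restored == files)  # True
```

Because the compressor sees one continuous stream, redundancy shared between members (here, two files with identical contents) can be exploited, at the cost of having to decompress from the start to reach a later member.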