Some file systems allow even unprintable characters, including Bell, Null, Return and Linefeed, to be part of a filename,[1] although most utilities do not handle them well.
Filenames may include a revision or generation number of the file, a numerical sequence number (widely used by digital cameras through the DCF standard), a date and time (widely used by smartphone camera software and for screenshots), or a comment such as the name of a subject or location, or any other text that helps identify the file.
For example, on the TOPS-10 and RSTS/E operating systems from Digital Equipment Corporation, files were identified by ...
On the OS/VS1, MVS, and OS/390 operating systems from IBM, a file name was up to 44 characters long, consisting of upper-case letters, digits, and the period.
Utilities and applications allowed users to specify filenames without trailing spaces and include a dot before the extension.
The attribute bits were moved to a special block of the file including additional information.
It allowed mixed-case long filenames (LFNs), using Unicode characters, in addition to classic "8.3" names.
Numbered file names, on the other hand, do not require that the device has a correctly set internal clock.
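The difference between the two naming schemes can be sketched in a few lines of Python. The prefixes `IMG_` and `Screenshot_`, the extensions, and the four-digit width are illustrative assumptions, not values taken from the DCF standard or from any particular device.

```python
from datetime import datetime


def next_numbered_name(existing, prefix="IMG_", ext=".JPG", width=4):
    """Return the next free sequence-numbered name (prefix/ext are hypothetical)."""
    highest = 0
    for name in existing:
        if name.startswith(prefix) and name.endswith(ext):
            digits = name[len(prefix):len(name) - len(ext)]
            # Only plain ASCII digits count as a sequence number.
            if digits.isascii() and digits.isdigit():
                highest = max(highest, int(digits))
    return f"{prefix}{highest + 1:0{width}d}{ext}"


def timestamped_name(when, prefix="Screenshot_", ext=".png"):
    """Build a name from a date and time; it is only as good as the clock behind `when`."""
    return f"{prefix}{when:%Y%m%d_%H%M%S}{ext}"


print(next_numbered_name(["IMG_0001.JPG", "IMG_0002.JPG", "notes.txt"]))
print(timestamped_name(datetime(2024, 1, 31, 9, 5, 0)))
```

The numbered variant needs only the list of existing names, while the timestamped variant depends entirely on the value of the clock it is given.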
In some cases, these lengths apply to the entire file name, as with the 44-character limit in IBM z/OS.
A particular issue with filesystems that store information in nested directories is that it may be possible to create a file with a complete pathname that exceeds implementation limits, since length checking may apply only to individual parts of the name rather than the entire name.
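A minimal sketch of this problem, assuming a per-component limit of 255 and a whole-path limit of 4096: both numbers, and the function name `check_path`, are illustrative assumptions rather than the limits of any specific system. Lengths are counted in characters here, whereas real systems often count bytes or code units.

```python
def check_path(path, max_component=255, max_path=4096, sep="/"):
    """Return a list of problems; an empty list means both checks pass.

    The default limits are assumptions chosen for illustration only.
    """
    problems = []
    for part in path.split(sep):
        if len(part) > max_component:
            problems.append(f"component too long ({len(part)} > {max_component})")
    if len(path) > max_path:
        problems.append(f"path too long ({len(path)} > {max_path})")
    return problems


# Thirty nested directories, each with a 200-character name: every component
# passes the per-component check, yet the complete pathname does not.
deep = "/" + "/".join(["d" * 200] * 30)
print(check_path(deep))
```

A system that performs only the first check would accept each directory as it is created and end up with a pathname that other tools cannot handle.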
For example, a Fortran compiler might use the extension FOR for the source input file, OBJ for the object output and LST for the listing.
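As a rough illustration of this convention, the following Python sketch derives the output names from the source name by swapping the extension. The function name `compiler_outputs` and the file name `PROG.FOR` are hypothetical.

```python
from pathlib import PurePath


def compiler_outputs(source_name):
    """Derive object and listing file names by replacing the extension."""
    src = PurePath(source_name)
    return src.with_suffix(".OBJ").name, src.with_suffix(".LST").name


print(compiler_outputs("PROG.FOR"))
```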
This led to wide adoption of Unicode as a standard for encoding file names, although legacy software might not be Unicode-aware.
Conversion was not possible as most systems did not expose a description of the encoding used for a filename as part of the extended file information.
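The following Python sketch shows why conversion fails without that information: the same stored bytes yield different names depending on which locale-dependent character set the reader assumes. The byte string and the three codecs are arbitrary choices for illustration.

```python
def decode_with(raw, codecs):
    """Decode the same filename bytes under several assumed encodings."""
    return {codec: raw.decode(codec) for codec in codecs}


# "fil" followed by a single byte above 0x7F, whose meaning depends on the code page.
raw_name = bytes([0x66, 0x69, 0x6C, 0xE9])
candidates = decode_with(raw_name, ["latin-1", "cp437", "koi8-r"])

for codec, text in candidates.items():
    print(codec, repr(text))
```

Nothing in the bytes themselves indicates which of the three readings the creator of the file intended.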
Nonetheless, some limited interoperability issues remain, such as normalization (equivalence), or the Unicode version in use.
A solution is the Non-normalizing Unicode Composition Awareness used in the Subversion and Apache technical communities.
The approach to uniqueness may differ in both case sensitivity and Unicode normalization form (such as NFC or NFD).
This means two separate files might be created with the same filename text but different byte representations of that name, such as L"\x00C0.txt" (UTF-16, NFC; Latin capital A with grave) and L"\x0041\x0300.txt" (UTF-16, NFD; Latin capital A followed by a combining grave).[16]
Some filesystems, such as FAT prior to the introduction of VFAT, store filenames as upper-case regardless of the letter case used to create them.
For example, a file created with the name "MyName.Txt" or "myname.txt" would be stored with the filename "MYNAME.TXT" (VFAT preserves the letter case).
Some file systems store filenames in the form that they were originally created; these are referred to as case-retentive or case-preserving.
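The normalization and letter-case behaviours described above can be sketched in Python with the standard `unicodedata` module. The helper `uniqueness_key` is hypothetical; real file systems use their own comparison rules and case tables, so this is only an approximation of the idea.

```python
import unicodedata

nfc_name = "\u00C0.txt"        # Latin capital A with grave, precomposed (NFC)
nfd_name = "\u0041\u0300.txt"  # Latin capital A + combining grave (NFD)


def uniqueness_key(name, normalize=None, fold_case=False):
    """Hypothetical key a file system might compare to decide if two names clash."""
    if normalize:
        name = unicodedata.normalize(normalize, name)
    if fold_case:
        name = name.casefold()
    return name


# Compared as stored, the two names differ, so two files could coexist.
print(nfc_name == nfd_name)

# Under a normalizing policy they collapse to the same key.
print(uniqueness_key(nfc_name, "NFC") == uniqueness_key(nfd_name, "NFC"))

# Under a case-insensitive policy, "MyName.Txt" and "myname.txt" clash.
print(uniqueness_key("MyName.Txt", fold_case=True)
      == uniqueness_key("myname.txt", fold_case=True))

# A system that stores names in upper case would keep only this form.
print("MyName.Txt".upper())
```

A case-preserving system would compare with a folded key like the one above while still storing the name exactly as it was created.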
Before Unicode became a de facto standard, file systems mostly used a locale-dependent character set.
By contrast, some new systems permit a filename to be composed of almost any character of the Unicode repertoire, and even some non-Unicode byte sequences.
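Software that treats filenames as Unicode text has to cope with such byte sequences somehow. One approach, sketched here in Python, is the `surrogateescape` error handler, which maps undecodable bytes to placeholder code points so the original bytes can be recovered; Python's `os.fsdecode` relies on it on POSIX systems. The filename bytes are an invented example.

```python
raw = b"report-\xff.txt"  # 0xFF never occurs in well-formed UTF-8

try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print("strict UTF-8 decoding succeeded:", strict_ok)

# Undecodable bytes become lone surrogates instead of raising an error...
name = raw.decode("utf-8", errors="surrogateescape")
print(repr(name))

# ...and encoding with the same handler restores the original bytes exactly.
round_tripped = name.encode("utf-8", errors="surrogateescape")
print(round_tripped == raw)
```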
In Unix-like file systems, the null character[18] and the path separator / are prohibited.
File system utilities and naming conventions on various systems prohibit particular characters from appearing in filenames or make them problematic.[8]
Except as otherwise stated, symbols such as " and < cannot be used in Windows filenames.
[20] For example, DOS device files.[22]
Systems that have these restrictions cause incompatibilities with some other filesystems.
For example, Windows will fail to handle, or raise error reports for, these legal UNIX filenames: aux.c,[23] q"uote"s.txt, or NUL.txt.
NTFS allows each path component (directory or filename) to be 255 characters long.
Windows forbids the use of the MS-DOS device names AUX, COM0, ..., COM9, COM¹, ..., COM³, CON, LPT0, ..., LPT9, LPT¹, ..., LPT³, NUL and PRN.
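A portability check for these restrictions might look like the following Python sketch. The device names come from the list above; the set of forbidden punctuation is an assumption based on the commonly documented Windows rules, and the rule that a reserved name stays reserved when followed by an extension is inferred from the aux.c and NUL.txt examples. Control characters and trailing dots are not checked.

```python
# Reserved device names, as listed in the text above.
RESERVED = {"AUX", "CON", "NUL", "PRN"}
RESERVED |= {f"COM{d}" for d in "0123456789\u00b9\u00b2\u00b3"}
RESERVED |= {f"LPT{d}" for d in "0123456789\u00b9\u00b2\u00b3"}

# Assumed set of punctuation that Windows rejects in filenames.
FORBIDDEN_CHARS = set('<>:"/\\|?*')


def windows_problems(name):
    """Return reasons why `name` may be unusable on Windows (empty list if none found)."""
    problems = []
    bad = sorted(set(name) & FORBIDDEN_CHARS)
    if bad:
        problems.append("forbidden characters: " + " ".join(bad))
    # The part before the first dot decides whether a device name is hit.
    stem = name.split(".", 1)[0].rstrip(" ").upper()
    if stem in RESERVED:
        problems.append(f"reserved device name: {stem}")
    return problems


for candidate in ("aux.c", 'q"uote"s.txt', "NUL.txt", "notes.txt"):
    print(candidate, windows_problems(candidate))
```

A check of this kind is useful when files are created on a Unix-like system but may later be copied to a Windows volume.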