Monday, February 4, 2019

Standard Directory System

STD -- Standard Directory

Overview

The 'Standard Directory System' has been in use in various forms for more than 30 years. The principal aim of the system was to avoid collisions. That is, it was intended to ensure that files that are different from one another never overwrite one another. This was accomplished by having conventions for the names that are used for various common programs.

As the system has developed, it has also come to organize things so that they are easier to find. This is accomplished by having conventions, as above, for particular programs, general conventions for particular types of programs, and conventions for standardized directory structures. The standard anticipates various types of common requirements and assigns standard names to those requirements.

Standard Directory Names

Where possible, conventions that already exist have been followed. In some instances, such as the 'usr' directory under UNIX, it has been decided to deviate from the convention so that collisions are avoided. In other instances, such as the 'home' directory, the convention has been followed to allow easy transition back and forth between the systems.

General Conventions

Three letter name sizes

As a rule, directory names are only three letters long. There are several reasons for this:
  • People generally find 'TLAs' easier to remember.
  • By making most 'standard' names the same size, other directories are easy to spot in a listing.
  • Organizing large disks can require very deep structures. Using long names often leads to over-runs in the size of the name. This is particularly true for various target media. For instance, some media limit the total path length to only 63 characters. ISO 9660 CD-ROMs have a maximum path length of 207 characters.
By making the standard names only three characters long, we free up more space for other software that does not follow the standards. For instance, here is a real pathname as constructed by other software and its equivalent as a 'standard directory name':
  1. C:\Program Files\Microsoft Small Business\Small Business Accounting Addins\Fixed Asset Manager\Templates
  2. C:\std\app\mcs\msb\sba\fam\tpl
Example 1 is 104 characters long. Example 2 is only 30. Should there arise an occasion (as there does in the real world) where one would want to copy a backup with its full directory structure into the bottom of a path like Example 1, the combined path would exceed the length allowable on most CD-ROMs and it would not be possible to copy it there. Worse, many file systems will allow you to write a path that exceeds their limits and then not allow you to read or delete that path.

Experience with real world systems where entire structures must be written deep into other directory structures shows that problems arise quickly if very long names are used. Since we can't control the use of long names in other systems, we are doubly obliged to keep our own usage to a minimum.
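The arithmetic above can be checked with a short Python sketch. The two path strings are copied from the examples; the 207-character limit is the figure quoted in the text and is used here only as an illustrative value. The helper name `headroom` is made up for this example.

```python
# Path strings copied from the two examples above.
LONG_PATH = r"C:\Program Files\Microsoft Small Business\Small Business Accounting Addins\Fixed Asset Manager\Templates"
SHORT_PATH = r"C:\std\app\mcs\msb\sba\fam\tpl"

# Limit quoted in the text for ISO 9660 CD-ROMs (illustrative value only).
PATH_LIMIT = 207


def headroom(path: str, limit: int = PATH_LIMIT) -> int:
    """Return how many characters remain before the path reaches the limit."""
    return limit - len(path)


if __name__ == "__main__":
    for label, path in (("long", LONG_PATH), ("short", SHORT_PATH)):
        print(f"{label}: {len(path)} characters used, {headroom(path)} left")
```

Under that assumed limit, the long path leaves roughly half as much room for anything copied beneath it as the short one does.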

Simple Names

The standard specifies that names should be, as much as possible, formed according to simple 'lowest common denominator' rules. This is in keeping with the principle that we are attempting to avoid 'collisions'. In this case, the collisions are with the conventions of a given operating system. That means:
  • No spaces; use underscores if needed.
  • No characters that conflict with operating system shells -- example: '(', ')', '[', etc.
  • Generally use lower case letters to remain compatible with case-sensitive environments such as HTTP URL paths.
  • If practical, there is still a bias toward 8.3 DOS style file names, especially with files that may travel to many operating environments -- example: readme.txt
As a general rule, the name should be able to retain its exact structure across all environments where it is likely to reside.
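As a rough illustration of these rules, here is a minimal Python sketch of a name checker. The function names and the exact set of shell-special characters are assumptions made for this example; the text only names '(', ')' and '[' and leaves the rest as "etc."

```python
import re
from typing import List

# Characters that commonly conflict with shells. The text names '(', ')'
# and '[' explicitly; the rest of this set is an assumption.
SHELL_CHARS = set("()[]{}<>|&;*?!$'\"`\\")

# DOS 8.3 style: up to 8 characters, optionally a dot and up to 3 more.
DOS_8_3 = re.compile(r"^[^.\s]{1,8}(\.[^.\s]{1,3})?$")


def name_problems(name: str) -> List[str]:
    """Return the 'simple name' rules that a single name breaks."""
    problems = []
    if any(ch.isspace() for ch in name):
        problems.append("contains whitespace (use underscores)")
    if any(ch in SHELL_CHARS for ch in name):
        problems.append("contains shell-special characters")
    if name != name.lower():
        problems.append("contains upper-case letters")
    return problems


def fits_8_3(name: str) -> bool:
    """True if the name fits the DOS 8.3 pattern (a bias, not a hard rule)."""
    return bool(DOS_8_3.match(name))


if __name__ == "__main__":
    for candidate in ("readme.txt", "Program Files", "notes(1).txt"):
        print(candidate, name_problems(candidate), fits_8_3(candidate))
```

The 8.3 check is kept separate because the text treats it as a preference rather than a requirement.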

Some specific names

Root directory

The root directory currently in use is 'std'. This is not particularly important, and for anyone using the system outside of ones created by us, any three-letter root that does not collide with conventional directories would be fine.

Standard Sub-Directory Names

This is not an exhaustive list and it changes from time to time as standards change.

std -- Root Directory

Root directory as currently used. This can reside on any drive or, in the case of UNIX, would hang off the regular root. Example: C:\std\
  • app -- Applications -- Application directories.
  • arc -- Archives -- Not quite the same as the backup directory. This directory merely contains archives of files that are taking up space but may still be needed from time to time. It is also a 'scratch area' for archiving temporarily to free up disk space.
  • bin -- Executable Files -- Binary executables and scripts.
  • bkp -- Backup Files -- Backup files and related files. These are intended for actual backups. What type of backup resides here depends upon the location of the directory.
  • cyg -- Cygwin -- Under Windows environments, reserved for Cygwin.
  • dvl -- Development -- Software development directories.
  • doc -- Documents -- *.doc, *.txt, *.xls, *.ppt, etc.
  • home -- User Home Directories -- Retains the UNIX convention for the home directory. Deprecated in favor of 'hme'.
  • hme -- Home Directories -- Replaces 'home', which is deprecated. Making user homes different from both Windows logins and UNIX logins makes the system more 'OS agnostic'. It also does not suffer from the clutter put in home directories by well-meaning admins. Copying /std/hme/nzt/doc from one system to another should not interfere with anyone else's conventions.
    There is also the matter of symmetry. Branches should generally be three letters. That seems to work best in practice, and it makes directory trees easier to visualize, less tedious to type, etc. Leaves can (and arguably should) take longer names, so /std/hme/rst/dvl/vb6/tst/TestTrackBall would be perfectly valid. That path breaks down as:
      • std -- Standard root
      • hme -- Home directories
      • rst -- Robert Stephen Trower's home directory
      • dvl -- Development
      • vb6 -- vb6-specific development
      • tst -- Test programs
      • TestTrackBall -- Working directory to test TrackBall code
  • inc -- Include Directories -- Contains various types of include files. This is generally different from the C-language standard 'include' directory.
  • lib -- Library Directories -- Libraries. This typically contains *.lib, *.o or *.obj files.
  • std -- std 'reflection' directory -- Under C:\std, for instance, this would be C:\std\std. It is used as a 'reflection' of a standard structure that might, for instance, be the subject of a network share.
  • svn -- Subversion -- Subversion revision control system.
  • tmp -- Temp Files -- Temp files. This is especially used by things such as session variables.
  • trn -- Transfer Files -- Transfer directory. When files are being transferred into or out of the system, this is where they reside during transfer.
  • web -- WWW Files -- Root of things related to a local web server.
  • wrk -- Work/Scratch Files -- Many things done on a file system are experimental, have a limited lifetime or require some thought before a permanent home is created. The 'wrk' directory is designed to give a quick area to do work that will not interfere with the rest of the system.
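
As an illustration only, the directory names in the list above could be laid down under a chosen parent directory with a few lines of Python. The function name `create_std_tree` is hypothetical, and the deprecated 'home' entry is left out on the assumption that a new tree would use 'hme'.

```python
import tempfile
from pathlib import Path

# Sub-directory names taken from the list above. 'home' is omitted because
# the text marks it as deprecated in favor of 'hme' (an assumption about
# what a freshly created tree should contain).
STD_SUBDIRS = [
    "app", "arc", "bin", "bkp", "cyg", "dvl", "doc", "hme",
    "inc", "lib", "std", "svn", "tmp", "trn", "web", "wrk",
]


def create_std_tree(parent: Path, root_name: str = "std") -> Path:
    """Create <parent>/<root_name> with the standard sub-directories."""
    root = parent / root_name
    for name in STD_SUBDIRS:
        # exist_ok lets the function be re-run on an existing tree.
        (root / name).mkdir(parents=True, exist_ok=True)
    return root


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as scratch:
        root = create_std_tree(Path(scratch))
        print(sorted(p.name for p in root.iterdir()))
```

Because the root name is a parameter, the same sketch works for any other three-letter root.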

Standard Path Variable

The 'standard path' is formed so that the commands available depend upon your context within the file system. For instance, there may be utilities specific to one set of files that do not apply to another. This ensures that when appropriate, the commands are on your path, but does not require your path to include everything on the disk.
The standard path usually contains something similar to the following:

Standard Paths

Absolute Paths

  • Personal command directory (overrides all): C:\std\home\myname\bin;
  • Global command directory: C:\std\bin;
  • Root command directory: \bin;

Relative Paths

  • Local command directory: bin;
  • Parent command directory: ..\bin;
  • Grandparent command directory: ..\..\bin;
  • Great-grandparent command directory: ..\..\..\bin;

The above forms a path similar to this:

path=C:\std\home\myname\bin;C:\std\bin;\bin;bin;..\bin;..\..\bin;..\..\..\bin;

Operating System Paths

Other paths are required as well, such as:
  • Adjunct OS path (Cygwin): C:\std\cyg\bin;
  • The operating system path: C:\WINDOWS;C:\WINDOWS\SYSTEM32;

It may also be convenient for certain things to be usable globally from the command line:

Application Paths

  • Application path: C:\std\svn\bin;

Putting things together in the intended order gives a path that looks like this:

path=C:\std\home\myname\bin;C:\std\bin;\bin;bin;..\bin;..\..\bin;..\..\..\bin;C:\std\cyg\bin;C:\WINDOWS;C:\WINDOWS\SYSTEM32;C:\std\svn\bin;
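
The ordering can be made explicit with a small Python sketch that assembles the path from its groups. The group names and the `build_path` helper are invented for this example; the entries come from the sections above, with the Windows system directory written as C:\WINDOWS\SYSTEM32.

```python
# Path groups in the intended search order; entries are from the text.
PERSONAL = [r"C:\std\home\myname\bin"]
GLOBAL = [r"C:\std\bin", r"\bin"]
RELATIVE = ["bin", r"..\bin", r"..\..\bin", r"..\..\..\bin"]
OS_PATHS = [r"C:\std\cyg\bin", r"C:\WINDOWS", r"C:\WINDOWS\SYSTEM32"]
APP_PATHS = [r"C:\std\svn\bin"]


def build_path(*groups):
    """Join the groups in order, terminating every entry with ';'."""
    entries = [entry for group in groups for entry in group]
    return "".join(entry + ";" for entry in entries)


if __name__ == "__main__":
    print("path=" + build_path(PERSONAL, GLOBAL, RELATIVE, OS_PATHS, APP_PATHS))
```

Keeping the groups separate makes it easy to reorder them or drop one (for instance the Cygwin entry on a machine without Cygwin) without retyping the whole path.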

As can be seen above, this is already a substantial path. However, it allows a very wide range of commands to be available while working and leaves room for the many applications that add their own paths during installation. Because many of the commands are on relative paths, the search for a command is much faster than it would be if all application commands were on the path at once. The bin;..\bin;..\..\bin; relative path construct also allows the placement of commands to show a finer granularity and greater specificity to the task at hand.
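
To see how the relative entries change with context, the following Python sketch resolves them against a working directory. The working directory C:\std\dvl\vb6\tst is a hypothetical example, and `resolve_relative_entries` is a name invented here; `ntpath` is the standard library's Windows path module and behaves the same on any platform.

```python
import ntpath  # Windows path rules, usable on any platform

# The relative entries from the standard path.
RELATIVE_ENTRIES = ["bin", r"..\bin", r"..\..\bin", r"..\..\..\bin"]


def resolve_relative_entries(cwd, entries=RELATIVE_ENTRIES):
    """Return the directories the relative entries point to from cwd."""
    return [ntpath.normpath(ntpath.join(cwd, entry)) for entry in entries]


if __name__ == "__main__":
    # Hypothetical working directory, chosen only for illustration.
    for directory in resolve_relative_entries(r"C:\std\dvl\vb6\tst"):
        print(directory)
```

From that working directory the four entries resolve to the bin directories of tst, vb6, dvl and std in turn, so the most specific commands are found first.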

By placing everything that pertains to a given system within its own directory structure, backup, removal and restoration of backups is simple and has a minimal impact on the rest of the system.

Relative paths also allow directories to be relocated simply and allow completely operational copies of a directory to be created in other places. It is possible, therefore, to try a radical global change on a copy of a group of directories rather than on the original.

Note -- this is a working draft that is changing as you read this.