Tuesday, October 18, 2016

Received Development Methodology

Introduction

This is called 'The Received Methodology' because nothing has really been invented here beyond tidying up the process and describing it. It is, in bits and pieces, what some working programmers pass on to one another. It already exists in the wild on its own. It is 'what we do'.
There is some merit in formalizing some of the stages so that work is easier to characterize and carve off and so that developers do not have to make apologies for the real-world fact that things generally are not 'first time right' in the meaningful sense of being 'the best' or surviving in the marketplace without revision.
Things like 'pair programming', 'design patterns', 'RAD', 'use cases', 'XP' (eXtreme Programming), 'Agile Programming', test-driven design, etc., all have their place within this framework, if only because they *DO* happen and this is a description of 'what is' more than a description of 'what ought to be'. However, since this seems to be what creates the world's working software, it is near what 'ought to be' as well.

Yet Another Methodology

There is plenty of material out there in the world about how software *should* be built. There is plenty of material out there that is a postmortem analysis showing why things went horribly wrong. There are mountains of material showing the redacted version of how successful builders created things. However, except for anecdotes from some exceptionally humble and honest programmers (a rarity, but they exist), there is little available that speaks clearly to how most successful software is actually written in real life.
Despite all the advances over the years, as of 2016, software development still remains something of a black art. Transforming requirements and associated algorithms into software is still labor intensive. It involves much trial and error. The only proven way to develop working non-trivial software is to build, test and repeat. However, simply building a thing and testing it becomes effectively impossible as the size of the system increases. Work must be broken down into manageable pieces. 

Test Driven Development

The only way to minimize (but not eliminate) the impact of increasing size is to become more disciplined in reducing linkages between sections of code so that source objects, 'strings' of those source objects and the subsystems they compose can be tested separately without a complete test of the system.
Testing at each point in development is critical and the sooner it is done, the better. Testing as soon as feasible mitigates the damage caused by failures.
There is a school of thought that all development should be 'test driven' -- you write the code to test the code first. As in other things, this extreme notion suffers from its extremeness. Testing is vital, but it is not the only thing. Coding only to the test can create extremely fragile systems that fail as soon as a new condition develops. In real life, production systems have to be maintained and they change.
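As a minimal sketch of what reduced linkage buys, consider a hypothetical unit that computes a late fee. The function names and the fee rate are invented for illustration. Because the current date is passed in rather than fetched from a global clock, the unit can be tested on its own with no other part of the system present.

```python
from datetime import date

def late_fee(due: date, today: date, daily_rate: float = 0.50) -> float:
    """Fee owed for an item due on `due`, as of `today` (hypothetical rule)."""
    days_late = (today - due).days
    # Items returned early or on time owe nothing.
    return max(days_late, 0) * daily_rate

def late_fee_now(due: date, clock=date.today) -> float:
    """Production entry point; `clock` can be swapped out in a test."""
    return late_fee(due, clock())

# The unit is exercised alone, with fixed dates and a fake clock.
print(late_fee(date(2016, 10, 1), date(2016, 10, 18)))
print(late_fee_now(date(2016, 10, 1), clock=lambda: date(2016, 10, 3)))
```

The same idea scales up to 'strings': each one takes its collaborators as inputs, so a test can substitute simple stand-ins.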

Received Methodology

Our formal 'methodology' follows the protocol below and is consistent with industry practice. Even with a formal development method, there is much re-design and rewrite prior to delivering reasonable working software. Any novel, non-trivial software that actually performs a useful function must come as the result of a similar process. This process involves iterations of specification, design, development and testing.
Many have had this experience, but I submit that it is intrinsic to software development generally.
The 'Received methodology' has the following stages:
1) INSPIRE -- Some person or event triggers 'a notion' that software will be built.
    • Identify needs/opportunities
    • Create requirements
2) DESIGN
    • System design
    • String (sub-system) design.
3) CODE
    • Unit Design, build and test
    • String (multiple units) Design, build and test.
4) BUILD
    • String Assembly
    • Full System Build
5) TEST
    • System Test (all works together)
    • User acceptance Test (System is acceptable to client)
6) RELEASE
    • Release to Pilot
    • Release to production.
Note: after release to production, the system continues to undergo scrutiny, retest, redesign and reassessment for viability. Eventually, it is retired using the exact same cycle that was used to create it.

Flow Diagram


For each stage and sub-stage in the process, the existing work product is tested and sent back if it fails the test. Its status in the prior stage is examined and if the defect appears to be there, the work restarts there. If the defect does not appear to be there, then it is sent back a further step. This goes all the way back to the beginning and if the original rationale does not hold, the system is retired.
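The send-back rule described above can be sketched roughly as follows. This is an illustration only: `defect_appears_in` and `rationale_holds` are hypothetical stand-ins for what is, in practice, a human judgment made at each step.

```python
STAGES = ["INSPIRE", "DESIGN", "CODE", "BUILD", "TEST", "RELEASE"]

def restart_point(failed_stage, defect_appears_in, rationale_holds):
    """Return the stage where work restarts, or None if the system is retired.

    defect_appears_in: callable taking a stage name, returning True if the
                       defect seems to originate in that stage (assumed).
    rationale_holds:   True if the original reason for the project still stands.
    """
    idx = STAGES.index(failed_stage)
    # Examine prior stages, nearest first, stopping short of INSPIRE.
    for stage in reversed(STAGES[1:idx]):
        if defect_appears_in(stage):
            return stage
    # Everything was sent all the way back to the beginning.
    return "INSPIRE" if rationale_holds else None

# A failure found in TEST that traces to coding restarts at CODE.
print(restart_point("TEST", lambda s: s == "CODE", True))
# A released system whose rationale no longer holds is retired.
print(restart_point("RELEASE", lambda s: False, False))
```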
The cycles happen in repeated iterations many times. It is actually possible for something at the end of 'RELEASE' (actually deployed in the field) to get sent all the way back to 'INSPIRE'. That means one could build an entire system only to scrap it shortly after release into production. Does that happen in the real world? Yes. It may happen as much as 25% of the time or more with big systems. [See http://www.cis.gsu.edu/~mmoore/CIS3300/handouts/SciAmSept1994.html]
What is documented here simply reflects what actually happens on projects that deliver working systems into production. If you work without such a plan in mind, you will be constantly surprised. Timelines will surprise you; budgets will surprise you. You end up proceeding more or less as above anyway, or you will not deliver a proper working system.

Documentation

Communications documents (letters, etc.), archival sources, test data and the working software itself embody the best record of a system's design and development. To that end, the use of a revision control system is an excellent addition to any development process and crucial to large ones.

Revision Control

A revision control system can help at every stage by capturing the actions that were taken, by showing the context within which they were taken and showing why those actions were taken. In the graphic above, just about every polygon should generate some kind of material that goes into revision control. In fact, if one were using a formal workflow system, even the lines would generate some storage item. It is implicit in the process that there must actually be tests and supporting material for how to do the tests, what was actually done and what happened.

Releases and Version Numbers

Version numbers should be assigned early in the development cycle. Any severable component (such as a library) should have a version number and the version number should be appropriately available for inspection.
Version numbers convey important information and by popular usage, different portions of the version number have different names and they signify slightly different things.
Although details vary, software developers use conventions similar to the ones we use:
The version number looks like this:
MM.mm.rr.bbbb
MM -- Major Version
The Major Version is for significantly different releases. To some extent, it denotes a different software product. For instance, the real version of a Windows Server 2003 system might be:
5.2.3790
The version of a Windows 7 system might be:
6.1.7601
Windows Server 2003 and Windows 7 are significantly different bodies of software. The Windows 7 major version (6) follows on from Server 2003's (5) as the next higher number, but even though the software product family is the same, the software is different and many things require alteration to move from one to the next.
mm -- Minor Version
The Minor Version is for different releases. A move from 1.20 to 1.30 could be very significant, but the 1.30 software would be expected to be a super-set of the 1.20. It should break little or nothing, be backward compatible, largely interchangeable etc. A version move from 1.01 to 1.02 would be less significant, though a change in the minor version number generally means that some kind of significant change has happened or something of interest has been added.
rr -- Release
The release number says what release set this belongs to. Every time that a newly compiled version is put out into the wild, the release number should be incremented. In practice, this number is moved from something like 1.00.10 to 1.00.20 when the release changes something noticeable and from 1.00.10 to 1.00.11 when the difference is very small.
bbbb -- Build
The build number should be incremented whenever an executable is produced and will be distributed beyond the desk of the developer. Even though the code base may be identical, it is possible that different compiler switches were used or slightly different libraries were used. The build number allows the identification of a byte for byte identical executable file or system.
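One way to make the four parts concrete is a small value type. This is a sketch under assumptions: the class name `Version`, the zero-padded display widths and the `next_build` helper are choices made here, not part of any standard.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True, order=True)
class Version:
    """MM.mm.rr.bbbb; fields compare in order, major first."""
    major: int
    minor: int
    release: int
    build: int = 0

    @classmethod
    def parse(cls, text: str) -> "Version":
        parts = [int(p) for p in text.split(".")]
        if not 3 <= len(parts) <= 4:
            raise ValueError(f"expected MM.mm.rr or MM.mm.rr.bbbb, got {text!r}")
        return cls(*parts)

    def next_build(self) -> "Version":
        # A new executable leaves the developer's desk: bump only the build.
        return replace(self, build=self.build + 1)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor:02d}.{self.release:02d}.{self.build:04d}"

v = Version.parse("1.00.10")
print(v.next_build())
print(Version.parse("6.1.7601") > Version.parse("5.2.3790"))
```

Keeping the fields as integers means ordering falls out of ordinary tuple comparison, so '1.00.11' sorts after '1.00.10' without string tricks.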
Presentation of Version Numbers
Version numbers are typically presented on splash screens, in documentation, on 'usage' screens etc. They are displayed like this:
Release: 0.00.00, Wed Aug 11 21:01:00 2009
Build is generally omitted, but it should be available somehow to developers and support personnel. A time stamp helps to further identify the item if the build number is omitted or fails to increment for some reason.
In general, the version number should be displayed in as much detail as required by the typical viewer. Your software disk might only say 'Version 2'. Documentation might only say 'Version 2.01'. Help screens should display release to further narrow it down. Support personnel and developers might need the build number and even the time stamp.
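The 'as much detail as the viewer needs' rule might be sketched as below. The audience names and the exact strings are assumptions made for illustration; only the level of detail per audience comes from the text.

```python
def display_version(major, minor, release, build, audience):
    """Format a version with the level of detail suited to `audience`."""
    levels = {
        "packaging": f"Version {major}",
        "documentation": f"Version {major}.{minor:02d}",
        "help": f"Release: {major}.{minor:02d}.{release:02d}",
        "support": f"Release: {major}.{minor:02d}.{release:02d}, build {build:04d}",
    }
    return levels[audience]

for who in ("packaging", "documentation", "help", "support"):
    print(who, "->", display_version(2, 1, 0, 37, who))
```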

History and Rationale

This simple overview of the development process took many years to refine. Similarities of the stages reflect genuine similarities, but each item is subtly different as well. Other documented methodologies still either conflate different things or fail to make similarities apparent.
Although this is a strict formal methodology, it is one that reflects the 'messy' business of software development realistically. Software is not always inspired by something reasonable. It is not always a good idea, but will often proceed anyway. During early analysis, many things get changed. What may have been a poor idea can morph into a good one. Some projects that were ill-conceived at the outset will continue to move forward, even though most of the people in the trenches already know they are doomed. Our methodology is designed to keep that from happening as far as possible. Other methodologies have a built-in resistance to returning to prior stages. In some, the initial ill-conceived plans and designs are contracts and they doom the project to failure by disallowing fundamental corrections at a later stage.

Design Reality

At each stage moving forward, things start in a very immature state, have sketchy documentation (or none) and they change at each step. In the real world, systems rarely conform to the original design documents and in fact, they generally do not even conform to the final design documents once they are finished. Nearly all real-world documentation for deployed working systems is out of date. That is because unless there is an error in later stages, there is no will (or budget) to revisit the earlier stage.

Defects, Opportunities and Dangers

Development becomes more and more defect driven as time goes on. This is because, as the system grows, it begins to 'canalize' into the shape it takes on. It becomes more difficult to change earlier decisions as more and more later things depend upon those. There are, however, exceptions to this. Sometimes, something will become apparent at a late stage that either shows a much more elegant solution that is worth pursuing or (more often) a design defect as a result of an incorrect assumption will force redesign.
The author once worked on a project where the team had decided to implement using C++ templates at a time when the compiler was slow to build code that used them. It eventually got to the point where the system effectively could not be built. The decision to use templates like that was abandoned at a late stage and it required a significant effort to recover.

Ongoing Revision

The fact is, whether they want to admit it or not, developers of working systems have a lot of false starts. They return to the same ground many times in iterations. They use whatever they have at their disposal to get the job done. That sometimes involves stopping to build tools. It sometimes involves extensive manual efforts and it always requires ongoing revision.

Difference from the Waterfall

Note that although this appears similar to the Waterfall Methodology, it differs significantly. The Waterfall Methodology does not work on any reasonable scale requiring rapid time to market. On large projects the waterfall effectively guarantees failure. The reason the Waterfall fails is that it does not iterate over the problem. Things are specified much sooner than is reasonable. By the time certain things are properly known, it is too late to allocate resources.
In the Waterfall, the stages fall through inexorably to the bottom. In our methodology, they could go all the way back to the beginning and stop at any time. The goal is to reach pilot as soon as you can. The Waterfall lives or dies at the end and because of that, it generally dies. It is not likely that there has ever been a successful non-trivial project that was a true Waterfall.
The 'Received Methodology' is how the author has observed *successful* projects deliver software. These projects have ranged from smallish one coder 30K line products on up to quarter billion dollar projects. You could use different ways to name stages, accomplish cleavage, etc. However, at the end of the day, the above is generally true to what actually happens.

Hallmarks of the Process

Here are some hallmarks of our Methodology:

Numerous rapid releases.

These will generally stabilize into a proper release schedule as the software matures.

Clear versions

Because you have so many versions in the pipeline, you are forced to be able to identify the versions. Especially with mature software, major revisions are crucial. Software that is 'facing the net' and has security implications might undergo rapid small releases and patches on an ongoing basis. However, the more care taken with earlier stages, the more stable the software is likely to be.

Reproducible builds

The build process is a separate defined set of steps. On larger projects, there are deliverable documents and scripts associated with the build process itself.

Reusable code

The process lends itself to identification and reflection upon similarities between sub-assemblies. As a matter of course, because good programmers are ultimately lazy, things will tend to be reused and as they are reused they will be refined *for* reuse.

General symmetry

Because you revisit the same stuff so often, you massage it on the fly to leverage similarities. This goes beyond reusable code in libraries. It affects all parts of the process.

Reasonable backward compatibility

When the coders have to live with incompatible versions, they start getting more careful about backward compatibility. Good coders are 'lazy'. As things get moved out to libraries, they have a tendency to be more carefully designed and built. Because they are used by more than one calling system, they *must* maintain greater compatibility than single use code.

Finer granularity of tasks

Because you have to validate and prove at each step, you have a tendency to bite off smaller chunks.

Very good snap and fit interfaces

You only have to live with misunderstandings about interfaces a few times before you get more fastidious about clarifying interfaces up front. As with many of the aspects of this methodology, this is largely due to the fact that things get revisited.

Variable name length proportional to size of scope

This is a good habit that has a tendency to develop as senior coders migrate code out to libraries. The standard variable 'i' as an index counter makes good sense in a narrow local context. It makes code easier to scan and understand and its narrow scope makes misunderstanding its purpose unlikely. As code migrates out to libraries and/or gets bound to objects, the names must get longer to avoid collisions and to make them easier to understand in larger contexts.
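A small illustration of the habit, with invented names: the module-level constant is visible everywhere and gets a long, self-describing name, while the loop index that lives for three lines stays as `i`.

```python
# Wide scope: visible to the whole module, so the name explains itself.
MAX_FAILED_LOGINS_BEFORE_LOCKOUT = 3

def first_failure_index(attempt_results):
    """Index of the first failed attempt, or -1 if every attempt succeeded."""
    # Narrow scope: `i` and `ok` are unambiguous within these few lines.
    for i, ok in enumerate(attempt_results):
        if not ok:
            return i
    return -1

def is_locked_out(attempt_results):
    failed_login_count = sum(1 for ok in attempt_results if not ok)
    return failed_login_count >= MAX_FAILED_LOGINS_BEFORE_LOCKOUT

print(first_failure_index([True, False, False]))
print(is_locked_out([False, True, False, False]))
```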

Greater use of encapsulation

This has a tendency to happen on its own as people get bitten by global variables and objects. On revisit they realize they can squeeze scope. A cheap and easy way to contain the scope of an object is to contain the object in a parent object.
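As a sketch of squeezing scope by containing an object in a parent, suppose a list of connections used to be a module-level global that any code could mutate. The class and method names below are hypothetical, and the strings stand in for real connections.

```python
class ConnectionPool:
    """Owns what was once a global list; callers go through methods only."""

    def __init__(self, size):
        self._free = [f"conn-{n}" for n in range(size)]  # stand-in connections
        self._in_use = set()

    def acquire(self):
        if not self._free:
            raise RuntimeError("connection pool exhausted")
        conn = self._free.pop()
        self._in_use.add(conn)
        return conn

    def release(self, conn):
        if conn not in self._in_use:
            raise ValueError(f"{conn!r} was not acquired from this pool")
        self._in_use.remove(conn)
        self._free.append(conn)

    @property
    def available(self):
        return len(self._free)

pool = ConnectionPool(2)
c = pool.acquire()
print(pool.available)
pool.release(c)
print(pool.available)
```

Nothing outside the pool can now leave the list in an inconsistent state, which is the bite that globals tend to deliver.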

Compliance to User Demands

This process leads to much greater compliance to the *eventual* wishes of users. Everything is seen in its many manifest forms many times. That includes UAT, Pilot and release. In *real* software, released production code is generally also being re-worked for subsequent releases. The old Waterfall method shoots straight to the end and everyone pats themselves on the back and goes home. In the 'real world' method above, the culprits are still around when things go wrong (as they usually will in early releases).

Smoother Implementation

Repetition means that this process leads to better attention to roll-out and greater automation of upgrades. It also leads to better fallback/rollback strategies because numerous releases will cause rollback situations to occur in real life.

Better Code Base

As the feedback goes back into code/unit test, lessons learned help to enforce good practices (or maintain them because they don't 'die' by being removed from code).
On a larger system at least, you will see more elaborate logging and more formalized unit tests. This is because as the ground gets revisited, specs get tightened up and programmers are lazy. Rather than stepping through debuggers, examining core dumps and putting in ad-hoc print statements, they will 'automate' the tracing process. They will also automate the unit tests after a while. Installation, upgrade and repair routines will be better because there is more incentive to build them and make sure they work -- they get used a lot.
Niceties like backups and their corresponding restore routines are more likely to exist, be tested and work. There are a frightening number of systems unable to save and restore state out there. When you have to keep doing it for test, it has a tendency to work.
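A save/restore routine earns trust by being round-tripped routinely. The sketch below uses JSON purely as an example format; the function names are invented here.

```python
import json

def save_state(state):
    """Serialize application state to a text blob (JSON chosen for the sketch)."""
    return json.dumps(state, sort_keys=True)

def restore_state(blob):
    return json.loads(blob)

def round_trip_ok(state):
    """True if what comes back from a restore equals what was saved."""
    return restore_state(save_state(state)) == state

print(round_trip_ok({"users": ["a", "b"], "count": 2}))
# Tuples come back as lists, so this state does not survive a restore.
print(round_trip_ok({"point": (1, 2)}))
```

The second case is the kind of quiet loss that only shows up when restore is actually exercised.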
Error messages are more verbose, more meaningful and more helpful. After a few go-rounds of having to find and explore 'unexpected error' in ten different places, you have a tendency to make sure the message itself tells you what was happening and where.
You see fewer things like 'assert(impossible != TRUE)' statements and more validation and graceful error recovery. After dealing with unhelpful 'I died on the impossible and can't really tell you why' console messages reported in various different ways from different users, you have a tendency to tighten that up in self defense.
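For illustration, here is roughly what replacing a bare assertion with validation, a meaningful message and graceful recovery can look like. The `ConfigError` type, the port example and the default value are assumptions made for this sketch.

```python
class ConfigError(Exception):
    """Raised with a message that says what was wrong and where."""

def parse_port(raw, source="config"):
    try:
        port = int(raw)
    except (TypeError, ValueError):
        raise ConfigError(f"{source}: port must be an integer, got {raw!r}") from None
    if not 1 <= port <= 65535:
        raise ConfigError(f"{source}: port {port} is outside the valid range 1-65535")
    return port

def port_or_default(raw, default=8080, source="config"):
    """Recover gracefully: report the problem, then carry on with a default."""
    try:
        return parse_port(raw, source)
    except ConfigError as err:
        print(f"warning: {err}; falling back to port {default}")
        return default

print(parse_port("443"))
print(port_or_default("http", source="server.ini"))
```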

Working Software

The real hallmark of such a methodology, though, is that it produces working software. Think about every single piece of software you use from operating systems through compilers to end user office software. Chances are good that if you are using it, it is on its third, fourth, fifth or higher major version. That means, in most cases, it went back to the beginning and started all over again. In practice, if you speak to honest programmers, you will find that there was tons of stuff you never even saw. Things tried and failed and sent back in stages that never made it to the light of day. The entire thing is something of an 'organic' process.

Proven Design

It is, in practice, impossible to spec non-trivial software, in detail, up front. Until you build and deploy something, you really don't know for sure what it is you are actually trying to do. Non-trivial systems do a lot of stuff. It depends and depends and depends on so many things, it might well be unpredictable even in theory. That is, the empirical exercise of building and trying the software may be the only way to build non-trivial software that works reasonably.

How It Is Really Done

What you *can* say about software is that there was a reason it was started. Somebody at least took a stab at justifying and characterizing it, some sort of design took place (if even just on a napkin), some sort of sub-system design took place (if only on another napkin), somebody coded little things and put them together into larger things, somebody built the components and then used them to build the whole, somebody tested it in some fashion, users passed some sort of judgment on it, it was released to a small pool at first and a larger one later. In general, those parts build upon one another, so they happen best in sequence. You can do them out of sequence a little, but to the extent you do, it usually does not work well for you. At each stage, there is always at least a little fall back to a prior stage. If done well, the fall-back is anticipated or even welcomed as an opportunity to fix things at the stage where the fix has the best impact and is less costly.
Somebody very good, with a lot of experience, working on a small tool might complete the whole thing above in a day or two. They could 'sort of' skip steps by doing stuff in their heads or going with a 'big bang' test of the entire system and skip unit test. However, that is still likely sub-optimal and only a very skilled programmer could pull it off. [As an aside, a lot of problems come from very talented journeyman programmers whose short-term memory is at a peak. They write code with brain-buster routines and macros that even they themselves are unable to understand at a later date. If there are bugs (it is amazing how programmers continue to delude themselves that they don't make mistakes), finding and eliminating them is sometimes more time consuming than simply rewriting the code properly (that is, with fewer brain-busters) from scratch.]
Regardless of the skill of the team, if the project is a large one (even 100K lines), it will conform to the steps above, pretty much in that sequence. To the extent that it deviates, it will produce an inferior product dollar for dollar, and in the real world, such projects generally just fail outright.
This is how it happens in building UNICES of all flavours, compilers, languages like PHP, Perl, C/C++ and Java, other operating systems (the author was involved with beta testing and building applications with OS/2 early on, for instance), banking production systems, telecommunications production systems, small department specialty applications, vertical market applications, etc. That may not have been the plan, but that is what actually happened.

Versus Other Methodologies

There are many 'post hoc' accounts that speak to a different method of development, or perhaps to more certainty going forward. Despite the nice reports showing steady forward progress on, for instance, ISO 9000 projects, the situation 'in the pit' is decidedly different. Very large projects that follow a stiffer process where going back is vigorously constrained are cruelly hobbled. They fare badly when released to production. The developers get paid, the users sign off and it goes into production, but it is *not* pretty.
In the early days of OOP, the author reviewed a $250 million OOP project using Rational Rose and whatever OOP project management regime was in vogue at the time. More than one major vendor was involved in this project. Fifty million dollars into the project, the author asked everyone involved (the entire project was reviewed) to demonstrate the work they had in hand. The developers, in particular, with the best tools available, after consuming tens of millions of dollars were unable to demonstrate a single working subroutine. They could not fire up even one single thing and show it doing anything on a screen. They had nothing they could show working at all. This was on NextStep x86 boxes and the GUI was a major part of the deliverable. They could not display a single dialog after spending 50 million dollars. Just about ten million dollars later, as the review recommended, they canceled the project.
Demonstrable Results
The 'Received Methodology' produces demonstrable results *well* before consuming a million dollars. The OOP project above was doomed at the outset. However, had they followed the working methodology above, they would have discovered it was doomed long before consuming $60 million. The methodology above allows or even demands that the INSPIRATION remain in place because it is constantly being revisited.
Early Warnings
If you follow the method described, you discover at the earliest possible opportunity if something fails. You may not know for sure that the thing that 'passes' will go all the way to production and stay there, but you know for sure that the thing that fails is not going anywhere.
There will always be some 'gotchas' in production. Even mature systems get nailed in production. That means they have to go back to whatever point is required to fix the problem and then be rolled forward. Rolling forward is much easier if you have proper design, build, test and deploy regimes in place. If you follow the method above, you will as a matter of course. A passed regression test does not guarantee success, but a failed unit test guarantees that you either fix that or at least know it is an 'issue' at release.
It Works!
What is represented above is the distillation of 'what is done that works'. One need only examine the history of major software and source code repositories of working software to see that this is basically what happens. To the extent that artifacts are left behind, they support the notion that the stages above took place. Things like Firefox are the culmination of such a process over many years starting with Mosaic and its antecedents.

Inspiration

One thing in particular that is different about the above is the primacy of the 'INSPIRE' stage. This is arguably the most misunderstood, underrated and misrepresented stage of software development.
Years before even Mosaic (predecessor of the predecessor of Firefox) was built, Tim Berners-Lee provided and kept alive the 'inspiration' that led to the World Wide Web and the birth of the modern browser. It began with hypertext as a solution to the problem of navigating through information. A working hypertext system existed for many years before Mosaic was given the go-ahead.
It appears from the record (and makes sense) that the inspiration, the analysis/rationale and creation of requirements for modern browsers began a very long time ago and culminated in a series of iterations through to Mosaic. By Mosaic, it is fairly clear that the browser was a finished product. However, it also fed right back to the beginning of the cycle. That feedback culminated in Netscape, which inspired Internet Explorer and the race has continued from there. Firefox has a significantly more sophisticated development and release cycle than the original text-only hypertext browsers.

Iteration Creates the Product

As the iterative cycles unfold over time, the various formal aspects 'tighten up'. Initial rationale and requirements may be quite ethereal. It often happens (in corporate environments at least) that there is more of a 'vague notion' than an inspiration, and that rationale and requirements are written more to justify moving ahead than to really describe what is being done. As things move forward through prototypes, feedback to earlier stages, etc., the whole thing 'gels' (or does not and dies). Something in production that is successful enough to live another day will go back for better specification, a deeper and more meaningful rationale and an inspiration bolstered (and clarified) by initial success. Its stages will 'tighten up' as the repetition of tasks spurs greater specification, automation and understanding.

The Methodology

INSPIRE

Some person or event triggers 'a notion' that software will be built.
Identify needs/opportunities
In this initial stage, the reasons for undertaking the project are defined. This is revisited to refine particulars. Learning from later stages feeds back into this. New opportunities and needs are added and some may be removed.
One need or opportunity might be some function that the world needs for which there is no other product or existing products are inferior. Another might be a requirement imposed from beyond the reach of the developers. For instance, a new legal requirement could come into effect that necessitated a change to an existing system.
It could be that the 'need/opportunity' is a perceived marketing advantage.
Another reason might be to improve an existing process to reduce costs.
It is possible that upon reflection the needs and/or opportunities are not substantial enough to continue and the project is canceled.
Create requirements
This stage crystallizes the needs and opportunities into specific measures for success. At this stage, some things may be identified that could force things back to a previous stage. For instance, it may turn out that the requirements entail things that simply cannot be done and still have net benefits.

DESIGN

In the design stage, the characteristics of the software are defined so that they meet requirements.
System design
System design defines how the overall behavior of the system meets requirements. At this stage, it might be determined that design is not feasible or requirements are missing. It would then get sent back to refine the requirements so they are 'designable'.
At the system design phase it may also become apparent that there are new opportunities identified and it gets sent back to refine the requirements for these. For instance, designers may discover early on that meeting the requirements can be done cheaper by simply adopting code from another system. They might also find that with a small extra effort they can replace an older system as well.
String (sub-system) design.
String design defines the components that will make up the system. It may turn out at this stage that two things are similar enough that it is worthwhile to go back to system design to fuse them. It may turn out that the system design requires subsystems that cannot be done due to conflicts, costs, deficiencies in an operating system, communications delays, etc.

CODE

Once a subsystem is designed, it is passed on to coding. At this stage, it is in the hands of programmers who design and create the working software.
Unit Design, build and test
For coding, the 'strings' are decomposed into smaller units for coding.
String (multiple units) Design, build and test.
Once units are built and pass unit test, they are assembled and wired together into the complete sub-systems ('strings'). Note that on very large systems, there may be a finer granularity.

BUILD

Once strings pass testing they are passed to the team that builds the system.
String Assembly
At String Assembly, the subsystems are built. [Note that the build can involve many underlying build processes and subsystems may be nested].
At this stage, it may be found that despite passing string test in the development environment some aspect of the build environment (closer to production) makes building impossible and things have to be passed back to coding. For instance, something that sometimes happens on large projects is that the 'ground up' process of building strings takes too long to be practical. Another thing that can happen during development on large systems is that version dependencies fail in the build environment because the developers are using later versions.
The 'version dependency' issue is surprisingly widespread and affects even commercially released software because interfaces that *should* be entirely cast in stone get altered by people unfamiliar with downstream backward compatibility requirements.
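A build script can surface version dependency failures before the build proper starts. The sketch below assumes the required and installed versions are available as simple name-to-version mappings; how they are gathered, and the component names used, are hypothetical.

```python
def find_version_mismatches(required, installed):
    """Compare pinned versions with what the build environment provides.

    Both arguments map a component name to a version string.
    Returns a list of readable problem descriptions (empty if all match).
    """
    problems = []
    for name, wanted in sorted(required.items()):
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: required {wanted}, not installed")
        elif found != wanted:
            problems.append(f"{name}: required {wanted}, build environment has {found}")
    return problems

required = {"libfoo": "1.02.00", "libbar": "2.00.10"}
installed = {"libfoo": "1.03.00"}  # a developer's later version leaked in
for line in find_version_mismatches(required, installed):
    print(line)
```

An exact match is the strictest possible policy; a real project might accept any later release within the same minor version instead.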
Full System Build
The System Build creates the system as it will be installed. As with the String Assembly, it is possible that some aspect of the overall system makes the build impossible or sub-optimal. If so, it goes back to String Assembly. If it is simply an issue such as compiler switches, it can be corrected at String Assembly. Otherwise, it goes back to the 'Create' phase.

TEST

Once a complete system is built, it is passed on to testing. On large projects, this is done by a separate team, running in a separate environment that more closely mimics the target environment(s).
System Test
In system test, all the built components are tested as a system. Ideally, at least portions of the testing will be automated and the tests will completely cover the pathways the program can take. In practice, complete coverage is not feasible with large software systems.
To make certain that the expected functions work correctly, the testing team should have test case scenarios set up that provide adequate coverage of the code.
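A rough calculation shows why complete path coverage is out of reach and why chosen scenarios are used instead. The sketch assumes the simplest case, where every branch is an independent two-way choice, and an assumed test rate of 1000 tests per second. Real programs have loops and dependent branches, so treat the numbers as an illustration only.

```python
def count_paths(independent_branches):
    """Distinct paths through code with N independent two-way branches."""
    return 2 ** independent_branches


def years_to_test(paths, tests_per_second=1000):
    """Years needed to run one test per path at the assumed rate."""
    seconds_per_year = 60 * 60 * 24 * 365
    return paths / (tests_per_second * seconds_per_year)


for branches in (10, 20, 64):
    paths = count_paths(branches)
    print(branches, "branches:", paths, "paths,",
          f"{years_to_test(paths):.3g} years at 1000 tests per second")
```

Ten branches give 1024 paths, which is easily tested. Sixty-four branches give more paths than could be run in hundreds of millions of years at the assumed rate, and a large system has far more than sixty-four branches.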
At System Test, many things can be covered, such as stress-testing, testing for memory leaks, backup and recovery, disaster recovery and data integrity.
At this point, it should not be much of an issue, but system test should review and sign off on 'fit and finish'. This means that startup happens as expected, that messages can be read, that widgets are properly sized and lined up, that margins and decorations look clean and professional, that things like icons look and act correctly, that various resources such as help files are in place, etc.
System test is the last opportunity for the development team to review and correct their work before users see it.
By System test, on large systems, it is costly to stop and go back to correct errors. Errors that are caught are documented and the 'errata' are passed on to UAT. Only 'show-stopper' bugs are corrected and these corrections may be temporary fixes to allow the system to move forward.
All errors feed back into the earlier stages of the Methodology. However, depending upon the cost/benefit of correcting errors, some errors may be documented as issues but never corrected.
Some things may be deferred for a later release.
User acceptance Test (UAT)
By UAT, the system should be in a state that looks good, operates smoothly, meets requirements adequately and has any errata noted. It should be a completely working system with all elements at the appropriate level. For instance, if this release is associated with a new help file then the correct file should be in place.
In UAT, users run in an environment nearly identical to the production environment.
In UAT, users run things like 'use cases' to ensure that the system meets core functionality. Many bugs that should have been caught earlier get caught in the first few UAT tests. This is because developers tend to enter their data and operate the program in expected ways. Real users have a tendency to make many false starts, leave important fields blank, enter the wrong kind of data, etc.
It would (unless it is a test case) be unusual for the development team to attempt to paste an entire spreadsheet into a text field or huge binary files into a memo field. Similarly, if the *requirements* of the software state that a certain service must be available or active, the development team tends not to test the (somewhat pathological) case where those things are not done or done incorrectly.
End users will do all manner of things that are outside of expectations and can reveal critical bugs. It can be logged as an erratum that a program will crash if you *incorrectly* perform certain actions. However, if the program silently fails and ends up corrupting data, then it is a serious bug that must be corrected.
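The difference between failing loudly and failing silently can be sketched in code. The example below assumes a hypothetical order-entry field; the field name, the length limit, and the `InvalidInput` exception are invented for illustration. The point is only that bad input is rejected before anything is stored.

```python
class InvalidInput(ValueError):
    """Raised when a field cannot be accepted as entered."""


MAX_FIELD_LENGTH = 200  # hypothetical limit for this field


def parse_quantity(raw):
    """Convert a user-entered quantity to a positive int, or raise InvalidInput."""
    if raw is None or not raw.strip():
        raise InvalidInput("quantity is required")
    if len(raw) > MAX_FIELD_LENGTH:
        raise InvalidInput("quantity is too long")
    try:
        value = int(raw.strip())
    except ValueError:
        raise InvalidInput(f"quantity must be a whole number, got {raw.strip()!r}") from None
    if value <= 0:
        raise InvalidInput("quantity must be greater than zero")
    return value


def save_order(orders, raw_quantity):
    """Append an order only after the input has been validated."""
    quantity = parse_quantity(raw_quantity)  # raises before anything is stored
    orders.append({"quantity": quantity})
    return orders
```

Blank fields, pasted spreadsheets and negative numbers all end in a visible error here, and the stored orders are left untouched. A version that quietly stored a zero or a truncated value instead would be the serious kind of bug described above.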
As in the system test, problems found in UAT will tend to be documented as issues, but the system will proceed to release anyway.

RELEASE

At release, the program heads out into the wider world to be used, 'for real', by real users. Just as UAT reveals some unexpected things, so too does release.
Release to Pilot
Large and critical systems, especially new ones, should go to a 'pilot' release prior to 'rollout' into production. In pilot testing, systems are typically exposed to a greater number of users than the UAT team. They are also exposed to environments that are less well characterized. Some systems might only test under Windows XP because that is the 'spec' for the operating system. However, at pilot you can expect that things like that will be slightly out of spec. Users may be running other operating systems, and it is at this point that you may discover that even though the system 'should' run on Windows Server machines, it simply does not.
For the most part, pilot issues will simply be documented as issues to address with another release. However, certain critical things like security fixes may be made available as 'hot-fixes'.
Release to production
Once the system has been accepted in the pilot release, it goes to 'release proper'. A few things may change, such as additional errata, additional programs or hot fixes to deal with critical problems, etc.
By Release the program should be sound and suitable for production, even if it still has some issues.
As errata, new requirements and new opportunities present themselves, they feed back into the development cycle preparatory to a new release.
Issues such as security fixes and critical bug patches are typically released as hot fixes. Occasionally, depending upon the nature of the system, full updates will be supplied to bring the system up to a new minor version.
Data should remain intact between minor releases. Data should be migrated, if necessary and possible, between major releases.
Except for major releases, and generally even with major releases, backward compatibility should be maintained whenever possible.
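Migrating data between major releases is often handled by stamping stored data with a schema number and upgrading it one step at a time. The sketch below assumes records kept as dictionaries; the schema numbers, field names and migration steps are hypothetical.

```python
CURRENT_SCHEMA = 3  # hypothetical schema number for the new major release


def _v1_to_v2(record):
    """Hypothetical change: split 'name' into 'first_name' and 'last_name'."""
    first, _, last = record.pop("name", "").partition(" ")
    record["first_name"], record["last_name"] = first, last
    return record


def _v2_to_v3(record):
    """Hypothetical change: add an 'email' field with an empty default."""
    record.setdefault("email", "")
    return record


# Each entry upgrades a record from that schema number to the next one.
MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}


def migrate(record):
    """Return a copy of record upgraded step by step to CURRENT_SCHEMA."""
    upgraded = dict(record)  # work on a copy so the original stays intact
    version = upgraded.get("schema", 1)
    if version > CURRENT_SCHEMA:
        raise ValueError(f"record uses newer schema {version}")
    while version < CURRENT_SCHEMA:
        upgraded = MIGRATIONS[version](upgraded)
        version += 1
        upgraded["schema"] = version
    return upgraded


old_record = {"schema": 1, "name": "Ada Lovelace"}
print(migrate(old_record))
```

Because each step only knows how to move from one schema to the next, data from any older release can be brought forward by chaining the steps, and the original record is never modified in place.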

Note -- this is a working draft that is changing as you read this.