Wednesday, October 8, 2008

Happy belated 25th GNU!

The GNU concept was announced more than a quarter of a century ago on a mailing list by Richard Stallman. The date was Sept. 27. I am a few weeks late mentioning it.

In fairness, the founding of the Free Software Foundation did not happen until a couple of years later and I would expect that to be a bigger celebration. We shall see.

I wrote an article about Richard Stallman a number of years ago where I said:

"Here is what Richard has done for us all: He has created and protected the 'free software' movement. This has been a difficult task against great odds and he has taken a lot of unwarranted personal grief over the years."

True Liberty in all of its various forms still gets a bad rap. It was ever thus.

I am gratified that a true champion of Freedom like Richard Stallman and people like Lawrence Lessig are still around. They have taken outrageous attacks over the years, but I think that might finally be starting to settle down for them. Of course, one worries that this might be preamble to attempting to co-opt them. Lessig appeared to cave a little on DRM, but I can't see Richard ever being anything but his own man.

Congratulations to the FSF and to all of us. It is a milestone worth celebrating.

Tuesday, February 19, 2008

Original DataHush Encryption Strategies


This section discusses some of the encryption strategies originally employed by DataHush. Some novel strategies remain unpublished.

In general, an encryption’s strength relies upon the following:

- Encryption algorithm – the ‘formula’ used to encrypt

- Length of key

- Processing power/time

We have the following techniques that we feel make it possible to strongly secure a transmission:

Dual encryption technique and compression

Two strong encryptions are used. One method is based on a known published method, the other proprietary. A third layer is related in that the stream is compressed according to one of a battery of techniques. Compression is a form of encoding that effectively strengthens the encryption, since even if the decompression technique is known, it increases the burden of overhead required to break the code.

Physical possession

It is possible to require a proprietary hardware device. This would require physical possession of the hardware device to make a transmission (the software would not work without it).


The system can be configured to require a challenge-response from either party to a transmission. In addition, this involves a ‘two-way’ lock box of data that has never been transmitted, and it imposes a real-time requirement on any spoofing machine that will likely exceed the ability of any known machine.
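For readers who want something concrete, here is a minimal Python sketch of a challenge-response exchange. It is not the DataHush protocol, which is not described here. It swaps in a standard HMAC over a random nonce, and the name shared_secret is a hypothetical stand-in for lock box data. The real-time requirement is not modelled.

```python
import hashlib
import hmac
import secrets

# Hypothetical stand-in for never-transmitted lock box data shared in advance.
shared_secret = secrets.token_bytes(32)

def make_challenge():
    # A fresh random nonce per attempt, so old responses cannot be replayed.
    return secrets.token_bytes(16)

def respond(challenge, secret):
    # Prove knowledge of the secret without ever sending the secret itself.
    return hmac.new(secret, challenge, hashlib.sha256).digest()

def check(challenge, response, secret):
    expected = hmac.new(secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

challenge = make_challenge()
ok = check(challenge, respond(challenge, shared_secret), shared_secret)
spoofed = check(challenge, respond(challenge, b"wrong secret"), shared_secret)
print(ok, spoofed)
```

A party that does not hold the shared secret cannot produce a response that passes the check.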

Processor dependent key-scaling

This is an aid to making the encryption future-proof. The length of the key and the intensity of the calculations required are negotiated by either end of the system based on the CPU cycles available at either end. Ten years from now, the same software will require much greater capacity, even of a trusted party to decrypt. This means that if the processing power of a common workstation such as a PC is 4 orders of magnitude below that of the largest known machine, and it can force a real-time response that exceeds the capability of the larger machine, then as long as the differential in capacity holds true, the encryption can never be broken by superior processing power.
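One way to picture the negotiation is the Python sketch below. It is an illustration under assumptions, not the original design: it scales only the work factor of a standard key derivation (PBKDF2) rather than the key length, and the function names, the rate inputs and the half-second target are all hypothetical.

```python
import hashlib

def negotiate_iterations(rate_a, rate_b, target_seconds=0.5, floor=100_000):
    # rate_a and rate_b are each side's measured key-derivation iterations
    # per second (hypothetical inputs; a real system would benchmark them).
    affordable = int(min(rate_a, rate_b) * target_seconds)
    return max(floor, affordable)

def derive_key(secret, salt, iterations, length=32):
    # More iterations means more work for everyone, attackers included.
    return hashlib.pbkdf2_hmac("sha256", secret, salt, iterations, dklen=length)

iterations = negotiate_iterations(rate_a=2_000_000, rate_b=800_000)
key = derive_key(b"shared secret", b"per-session salt", iterations)
print(iterations, len(key))
```

Because the work factor is computed from what the two ends can afford today, the same code asks for more work on faster hardware tomorrow.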

Two-way lock box

A large body of data used only as additional encryption will be transmitted by a trusted means to both parties. This store of data will be used by both parties as a method of lengthening the encryption key. Without access to this store, an intercepting party is forced to crack the encryption using the entire key.
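A rough Python sketch of the idea follows. The mixing step (hashing the session key together with a slice of the store) is my assumption for illustration, as are the names lock_box and extended_key; the original mechanism is not specified here.

```python
import hashlib
import secrets

# Hypothetical lock box: random bytes delivered to both parties out of band.
lock_box = secrets.token_bytes(4096)

def extended_key(session_key, box, offset, span=64):
    # Mix a slice of the lock box into the session key. The offset could be
    # sent in the clear; the slice itself never travels over the wire.
    chunk = box[offset:offset + span]
    return hashlib.sha256(session_key + chunk).digest()

session_key = secrets.token_bytes(16)
sender_key = extended_key(session_key, lock_box, offset=128)
receiver_key = extended_key(session_key, lock_box, offset=128)
outsider_key = extended_key(session_key, secrets.token_bytes(4096), offset=128)
print(sender_key == receiver_key, sender_key == outsider_key)
```

Both trusted parties arrive at the same key, while someone who intercepts only the session key and the offset does not.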

Non-deterministic decryption algorithm

This technique is used to ‘up the ante’ in terms of required processing power. Not all of the information required to decrypt will be available to the receiving party. This can impose an arbitrary time of decryption, even if keys are intercepted. This will require the decryption process to actually guess part of the key. Sometimes, a packet will fail to transmit end to end, since the receiving party simply does not have the resources to decrypt. This introduces a further variable of noise that will confound an intruder, but be scaled within the limits of both ends of the trusted parties.
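The 'guess part of the key' step can be sketched in Python as below. This is only an illustration under assumptions: the text does not say how a guess is recognized as correct, so a SHA-256 check value stands in for that, and the function names are hypothetical.

```python
import hashlib
import secrets

def withhold(key, missing_bytes=2):
    # Sender keeps back the tail of the key and publishes only a check value.
    return key[:-missing_bytes], hashlib.sha256(key).digest()

def recover(partial, check, missing_bytes=2):
    # Receiver brute-forces the withheld bytes; cost grows 256x per byte.
    for guess in range(256 ** missing_bytes):
        candidate = partial + guess.to_bytes(missing_bytes, "big")
        if hashlib.sha256(candidate).digest() == check:
            return candidate
    return None  # out of guesses: the packet is treated as undeliverable

key = secrets.token_bytes(16)
partial, check = withhold(key)
print(recover(partial, check) == key)
```

Raising missing_bytes is the knob that 'ups the ante': each extra byte multiplies the guessing work by 256 for everyone who has to decrypt.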

Decoying and nested decoys

Not all of the data in our secured transmissions will be data. Some of it will be noise, and the amount will vary from transmission to transmission. In addition, mock data that appears to be encrypted by simpler methods will be included in the transmission. This will occupy the resources of an intruder that might otherwise be engaged in breaking the true transmission. Decoying is nested at each level of the encryption process, requiring an intruder to follow many blind alleys at each level.

Public Key Encryption

Explaining Public Key Encryption

Many years ago, as a part of my company's research, I built a tool called 'DataHush'. It was a drag and drop encryption/decryption program. The tool, as I built it for original demonstration, did not include standard Public Key encryption. Patents encumbered well-known systems and I have always shied away from patents.

Despite not supporting it in the original tool, the design demanded Public Key encryption and supporting Infrastructure (PKI). It also had facilities that I felt improved upon the strength of PKI as generally practiced. I felt it necessary to explain to business partners just exactly HOW Public Key encryption worked and how, for banking and mission critical information PKI alone was (potentially) flawed.

I am currently working on a project where I have been asked to deliver some tools and protocols for an advanced secure infrastructure. First, though, it requires PKI and most people have such a hard time with the basics they have no hope of understanding the finer points of what my company has done to improve upon current PKI.

When I originally published information on our old system, I put together as simple an explanation as I could for PKI. This is still not entirely accessible to people without at least High School Math from the latter years. However, it does convey a concrete interpretation that should be meaningful to more people than typical discussions of this subject. So ... I dug up my old explanation (from the WayBack machine, bless them), formatted it for this environment and ... Here it is.

Here’s How it Works

Public key encryption is based on a mathematical relationship between prime numbers, and the (presumed) computational difficulty of doing particular mathematical operations on large numbers, such as factoring (RSA) or discrete logarithms (ElGamal).

Here is an example of a particular public key system (RSA) whose strength depends upon the difficulty of factoring large numbers:

To start, we need to pick two prime numbers. In practice, these are very large multi-digit numbers. Here, we use small values to make it easier to understand.

Choose p1 and p2, say 3 and 11. We obtain an exponent value from the following equation:

(p1-1) * (p2-1) + 1 = x

For our numbers, this works out to be:

(3-1) * (11-1) + 1

= 2 * 10 + 1


x = 21

Now, multiply p1 by p2 to obtain a ‘modulus’ value (m). For our numbers, this equals:

3 * 11


m = 33

For any value (v) from 0 to (m-1), there is an equation that holds true:

v = v ^ x mod m
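As a quick sanity check, the identity above can be verified by brute force for the toy values. This is a minimal Python sketch; the variable names simply mirror the ones used in the text.

```python
# Toy values from the text; real keys use primes hundreds of digits long.
p1, p2 = 3, 11
x = (p1 - 1) * (p2 - 1) + 1   # exponent value
m = p1 * p2                   # modulus

# pow(v, x, m) computes (v ** x) mod m without building a huge intermediate.
identity_holds = all(pow(v, x, m) == v for v in range(m))
print(x, m, identity_holds)
```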

Now we factor the exponent value such that factor 1 (f1) multiplied by factor 2 (f2) is equal to the exponent value. In our case:

f1 * f2 = x

3 * 7 = 21


f1 = 3 and f2 = 7

One of the factors is chosen as our public key, the other our private key. We make life easier for the public by choosing the smaller of the two.

To encrypt a message, someone takes the (known) public key and uses that to encrypt the message using the following formula:

Encrypted = Plain ^ f1 mod m

For our example, let’s say the letter G is being encoded and (to make the math easier) it’s the 7th letter in our alphabet. We assign it a value of 7. So:

Encrypted = 7^3 mod 33

= 13

To decrypt, you use the formula ‘in reverse’:

Decrypted = Encrypted ^ f2 mod m

In our case, this yields:

Decrypted = 13 ^ 7 mod 33

= 7

We have our original message back. Math weenies go nuts for this stuff!
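For anyone who would rather run the arithmetic than do it by hand, the round trip can be sketched in Python with the same toy numbers. The function names are mine, and the letter-to-number mapping (A = 1, so G = 7) follows the example above.

```python
m = 33          # modulus (3 * 11)
f1, f2 = 3, 7   # public and private exponents from the text

def encrypt(plain, public_exp=f1, modulus=m):
    return pow(plain, public_exp, modulus)

def decrypt(cipher, private_exp=f2, modulus=m):
    return pow(cipher, private_exp, modulus)

letter = "G"
plain = ord(letter) - ord("A") + 1    # G is the 7th letter
cipher = encrypt(plain)
recovered = decrypt(cipher)
print(plain, cipher, recovered, chr(recovered + ord("A") - 1))
```

With a modulus this small only values from 0 to 32 can be encoded, which is one reason real keys are so large.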

It is important to understand, math aside, that the encryption is not symmetrical. This gives it important properties, which we exploit.

1) You can only read a message encrypted with one key if you hold the other key.

2) Anybody can send you a secure message by encrypting with your public key.

3) If we can decrypt with the public key, the sender encrypted with the private one.

It is essentially (with some optimizations) item three above that constitutes digital 'signing'. If we know you are the owner of a given public key and we can prove that a message decrypts with that public key and your private key is only under your control, then you must have 'signed' that message.
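Signing with the toy key can be sketched the same way in Python. The function names are mine, and this skips the optimizations: real systems sign a hash of the message with padding rather than the raw value.

```python
m = 33
public_exp, private_exp = 3, 7

def sign(message_value):
    # Only the key owner can do this: it needs the private exponent.
    return pow(message_value, private_exp, m)

def verify(message_value, signature):
    # Anyone can do this: it needs only the public exponent and modulus.
    return pow(signature, public_exp, m) == message_value

signature = sign(7)
print(signature, verify(7, signature), verify(8, signature))
```

The signature checks out against the message that was signed and fails against a different one.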

Here’s How it Fails

Public key encryption has a point of weakness in that two vital pieces of information are known by the attacker - the public key and the algorithm chosen for encryption. Although no civilian scientist has published an elegant method of attacking the algorithm mathematically, it has not been proved invulnerable to attack. It may already be the case that military or government scientists have discovered a computationally simple way of cracking this method of encryption. It is certainly a possibility. Meantime, a modified brute force method leaves public key encryption highly suspect. Here’s why:

Although it is computationally very difficult to examine and try all of the prime numbers in the ‘open’ range, in practice the method of generating keys is such that once one of the keys is chosen, the other may fall within a range that can be narrowed down using the public key. Once you have chosen a limited selection of numbers to try, you simply run the whole set through the known algorithm until you reveal the private key. Once a private key has been discovered, it is useless and all messages encoded with that key are open to examination.
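To make the danger concrete for the toy key, here is a Python sketch that recovers the private exponent from nothing but the public values. Note the assumption: this is plain trial-division factoring, not the range-narrowing method alluded to above, which is not spelled out here. It works only because the modulus is tiny.

```python
def recover_private_exponent(public_exp, modulus):
    # Trial division: feasible only because the modulus is tiny.
    for p in range(2, int(modulus ** 0.5) + 1):
        if modulus % p == 0:
            q = modulus // p
            phi = (p - 1) * (q - 1)
            # Modular inverse; three-argument pow with -1 needs Python 3.8+.
            return pow(public_exp, -1, phi)
    raise ValueError("modulus has no small factor")

private_exp = recover_private_exponent(3, 33)
print(private_exp, pow(13, private_exp, 33))
```

Once the private exponent is known, the intercepted value 13 from the earlier example decrypts straight back to 7.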

Friday, January 18, 2008

Life in Hell

Fans of Matt Groening ('graining') will get the reference.

Anyway, I am posting to gripe about yet another generic problem in my saga of endless updates to a very complicated small business and research system.

Here's the background:

I have six servers and nine workstations. In addition, I use two servers and a workstation via VPN at a client site. There are other people on the systems, but for the most part, the systems are designed so that I can use various things. It is like a gigantic workstation. Operating systems currently installed:

FreeBSD (Remote Server for ssh, ftp, http, database, mail)
CentOS (Red Hat Enterprise Linux derivative)
Ubuntu Server
Ubuntu Desktop
Custom Linux for Wireless router.

Windows 98
Windows 2000 Pro
Windows XP Pro
Windows Vista Home Premium
Windows Vista Business
Windows Server 2003
Windows Terminal Server Edition

I also have various live-boot CDs and USB sticks (like DSL) that I use, and an ancient notebook running Windows 95. These are not all on the network proper. Believe it or not, there are even more systems that are used to boot machines from time to time.

The list above, though, represents what I have to manage on an ongoing basis. Older machines simply require updates to ROM every now and then. So do devices. That means the hair-raising process of flashing the BIOS. That can sometimes kill a machine.

With that many devices, hardware is failing all the time. Things you don't expect, such as motherboard network adapters, just die without warning. Hard disks usually give some warning before they croak, but not always. Every now and then, the BIOS will reset on a machine and it has to be re-configured again. When devices require upgrades (like to Gigabit LAN or 300 Mbps wireless or even just a new hard disk), all kinds of things just break. Every one of the last dozen manufacturer-supplied drivers I have had was out of date, broken and in need of an upgrade. Invariably, the software helpfully tells you that you need new software and then directs you to a web page that does not exist.

I can live with all of the above stuff. What I can't live with are all the dead-stupid design flaws that make it almost a full time job just to maintain what is essentially a large workstation.

Last night, for instance, I told Ubuntu 6.06 Desktop to upgrade itself. After a few questions and confirmations it went merrily on its way saying it would take about 2 hours to do the update. This is on a connection that downloads 30 megabytes a minute. Ok. I can take the bullet. It is just an adjunct workstation anyway. I did all the nonsense at 2:00 AM and then went to sleep. When I woke up, I found that it had more than an hour to go and had stopped dead asking me if I wanted to replace a file (that needed replacing). Having something like this in software is a show-stopping design flaw. It has been about ten hours since I started the upgrade and it is still running. It could have finished last night, except somebody who wrote the installation routine (for whatever piece it was) decided that they simply could not wait for an answer to that question and everything ground to a halt. This should never happen.

Windows, BTW, is no better. The only reason it goes faster is because I KNOW it is going to stop every 10 minutes to one hour (it can't really say when) to ask something that could have been asked at the end or the beginning for that matter. That means all the Windows upgrades are basically a half-day down the drain babysitting the constant re-boots. I have been programming at all levels for more than a quarter of a century (some of my code is actually in the Linux OS and applications on Windows, Linux and other operating systems besides). I can honestly see hardly any reason that a system should ever need to re-boot except for a design problem either in hardware or software. Windows can't seem to even install end-user programs in user space without re-booting. What is the deal with that?

Anyway, the big (abstract) gripe here is that more than ten years ago all software should have been incapable of these annoyances: stopping dead in the middle of a long installation, requiring a system reset, or failing to allow cut/paste/drag/drop operations where they are needed most (what is the deal with an error message that requires you to transcribe the text by hand?).

A huge bugaboo with me is that in Windows I am constantly having the focus taken away for no valid reason by one of the dozens of programs I run. It happens right in the middle of typing and since I touch-type and multi-task it sometimes happens that a bunch of evil keystrokes end up going into the program that stole the focus.

Every single dialog of any sort should:

1) Have cut, paste, log of all text and message conditions. It has been typed in once by the developer. It does not require typing (and should not even require saving) by the user.

2) Have a NEVER BOTHER ME AGAIN checkbox. If messages are just plain critical, they should log them to another program and allow me to get to them as I am able.

3) Have roll-forward/roll-back and timeout so that if installing and not answered within (say) ten minutes it makes the best choice it can, saves backout information or roll-forward information and returns to the caller so it can continue processing.

4) Never steal the focus. At the very least, it should wait until the user is not in the middle of input.

Every major software vendor should have a 'sequencing team'. This team should review every bit of software during design, test and maintenance to ensure that these stupid, stupid, stupid sequencing errors do not exist. It seems bizarre to have to form a special team for this, but after more than a decade of constant blunders by every single vendor large and small it is clear that they need it.

I am preparing development guidelines for my company and have a whole bunch of annoyances listed already. If you leave a comment here with one of your pet peeves and it is not already on the list, I will add it to that (LONG) list.
