Bits, Bauds, & Modulation rates

Serial communications, here meaning RS-232 and current loop

1999, revised slightly February 2005


mark and space
--------------

Serial lines are either spacing, or marking. I was always confused as to what mark or space meant, which was the state of the line when no characters were being sent, etc. I can't understand why now, it's amazingly simple.

The resting state is called "spacing". Mark makes marks on paper -- this was literally the genesis of the term, ye olde telegraphe lyne rested in the spacing state, the telegraph key when pressed imposes the marking state on the line -- and the remote telegraph receiver makes a MARK on the paper strip.

Simple!

150 years later, the meaning has hardly changed; the "start bit" is always a change from spacing (doing nothing, the resting state) to a mark state, i.e. doing something.

The thing that is "logically" backwards is that resting means current flowing; marking means the current is interrupted. For a teletype or telegraph circuit, current flows through the circuit and the selector magnet, keeping the mechanism from doing anything. There are two major reasons for current-flow=no-data. One, it provides an immediate indication of a problem if the mission-critical wire or current flow fails: teletypes chatter away when receiving "all ones" (i.e. all marking). Two, the continuous current has the positive side effect of "sealing" the metal circuitry; it helps prevent corrosion at screw terminals and the like. (With significant current flowing through a circuit, any potential drop across an imperfect connection swamps any low-level electro-chemical corrosion caused by dissimilar metals, etc. This "sealing current" is a major contributor to today's telephone network reliability.)


the character of characters
---------------------------

Characters are little packets of data, impressed on a wire by changing the voltage or current in a predetermined manner. The data within the character are "bits", in our world, from seven to eleven of them, generally seven or ten. More on this later.

Character streams are either synchronous, in synchrony with some reference time standard, or asynchronous, able to occur at any time, independent of any character(s) preceding or following.

Asynchronous is all we care about here.

Async (for short) characters contain synchronization and timing overhead within each character-structure, that take the form of additional bits, more than are required to merely contain the data. There are almost always two of these "overhead" bits.

(Synchronous protocols are not part of this story, but briefly: synchronous character streams do away with the need for per-character overhead by using an agreed-upon external timing system. Added complexity, decreased flexibility, but generally, increased speed. Enough said.)

Character data, either five (ITA2) or seven (ASCII) bits, are transmitted one bit at a time. The most-significant bit is transmitted first; it's not arbitrary, the reasoning is mentioned later. A voltage or current is impressed upon the wire for a predetermined and agreed-upon time, per bit. If the bit-time is 20 milliseconds, and there are five bits, it takes 100 milliseconds to transmit the data-portion of a character.

However, in the "real world" (sic), you can't just surprise some device with data bits. You have to give them some warning. This is where the overhead bits come in.

Historically, characters were sent and received by teleprinters. When a key is pressed, a character is generated (and obviously the character is "asynchronous", because humans pressing their fingers upon metal levers are "asynchronous"; last I knew, my fingers don't adhere to a precision clock source).

Sending is easy, receiving is hard.

An overhead bit precedes each character's data; an additional overhead bit follows it. These are called (duh) start and stop bits. Here is a picture of the letter "A" (from the ITA2 character code) being sent:



                 S   4   3   2   1   0  S
---------------+   +---+---+---+       +--------------------
               |   |           |       |
               +---+           +-------+
   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .


The dots show the relative bit timing.

The start bit is always a transition from spacing (doing nothing) to marking (doing something). This tells the receiving mechanism to wake the hell up, and get ready for the first data bit, which will begin with the next bit time. Then follow the five data bit 'slots'. In this case, the five bits are 0, 0, 0, 1 and 1 (at bit #'s 4, 3, 2, 1 and 0). The "stop" bit, you will notice, is simply the line remaining at its resting, or "spacing", state. This does two things. One, it provides an entire bit time for the mechanism to actually print the character (remember, for the first 100 years this meant actually moving a non-virtual metal lever, with actual mass, against the paper, and back). Two, it provides a decent margin of error, in case the sender and receiver are not running at precisely the same speed.

This last cannot be over-emphasized; in the earliest machines, the bit-rate, the speed of the machine, was determined by a mechanical governor: a mechanical subassembly that controlled the speed of a rotating shaft. A mechanical governor's accuracy and rapidity of adjustment are appalling by today's standards; plus, in the real world, things had to work in dirty, dusty environments.
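The framing in the picture above can be sketched in a few lines of Python. This is only an illustration, not anybody's official encoder: the names frame_levels and draw are made up here, the line is modelled simply as "current on" (True) or "current off" (False), and the bit order and polarity follow this article's description (resting line has current flowing, a 1 bit interrupts it, most-significant bit first).

```python
# Line levels as described in the text: the resting line has current flowing.
CURRENT_ON = True
CURRENT_OFF = False


def frame_levels(code, data_bits=5):
    """Return one line level per bit time: start bit, data bits, stop bit."""
    levels = [CURRENT_OFF]                      # start bit: current interrupted
    for i in range(data_bits - 1, -1, -1):      # most-significant bit first
        bit = (code >> i) & 1
        levels.append(CURRENT_OFF if bit else CURRENT_ON)
    levels.append(CURRENT_ON)                   # stop bit: line back at rest
    return levels


def draw(levels):
    """Crude one-character-per-bit picture: '-' is current on, '_' is off."""
    return "".join("-" if level else "_" for level in levels)


# The "A" from the picture: data bits 0, 0, 0, 1, 1.
print(draw(frame_levels(0b00011)))   # _---__-
```

The printed string has the same shape as the drawing: low for the start bit, high for three bit times, low for two, then high again for the stop bit.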


sending
-------

In the transmitting device, an electric motor is rotating reasonably close to the intended speed. A pressed key, let's assume the "A" key, forces five steel rods into a pattern (say: left/left/left/right/right) that reflects the character code for "A". On the first rotation after the key is pressed, the "start" bit is impressed on the wire with a simple switch. Then for each of the next five rotations of the motor's shaft, each of the keyboard-actuated rods is "sampled" with five simple switches; if the rod is to the left, the switch is closed; if to the right, the switch opens, forcing a MARK state upon the wire. On the seventh rotation the mechanism simply "wastes time", thereby asserting the stop bit on the wire.

The result is a voltage or current that if displayed on an oscilloscope would look like the schematic drawing above. Not bad for 1899.

(Today, a quick microprocessor asserts each bit onto the line, and waits for a pre-determined time between bits. Much simpler, assuming you have a few hundred thousand transistors available.)
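As a rough sketch of what that microprocessor is doing, here is the same job in Python. Everything named here is hypothetical: set_line stands in for whatever actually drives the wire (True meaning current on, False meaning current off), and the polarity and bit order are the ones used in this article. Real firmware would use a hardware timer or a UART rather than sleeping.

```python
import time


def send_character(code, set_line, bit_time_s, data_bits=5, sleep=time.sleep):
    """Assert start bit, data bits and stop bit, waiting one bit time after each."""
    set_line(False)                            # start bit: interrupt the current
    sleep(bit_time_s)
    for i in range(data_bits - 1, -1, -1):     # most-significant bit first
        bit = (code >> i) & 1
        set_line(not bit)                      # 1 -> current off, 0 -> current on
        sleep(bit_time_s)
    set_line(True)                             # stop bit: back to the resting state
    sleep(bit_time_s)


# Dry run: record the levels in a list instead of driving a wire, and skip the waits.
log = []
send_character(0b00011, log.append, bit_time_s=0.022, sleep=lambda seconds: None)
print(log)
```

Passing the sleep function in as a parameter is just a convenience so the sketch can be tried without waiting on real time.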


receiving
---------

Assume the same letter "A" transmitted above is received by our hypothetical mechanism.

Assume further that the receiving (and printing) device has five levers, whose particular left/right arrangement is used to select an inked letter to press on the paper. The five rods, to print "A", need to be left/left/left/right/right. Each rod's position is manipulated with an electromagnet called a "selector".

The line is in the resting, or spacing, state. The selector magnet is pulled in, because the wire rests with current flowing. A motor is running, if all is well, at the same speed (or some exact multiple thereof) as the transmitter's motor.

The start bit appears: the current drops to zero, the selector magnet lets go, and a clutch grabs the motor shaft and turns the decoding mechanism within a millisecond or two.

The first revolution lets the motor bring the mechanism up to speed; it's metal, has mass, and hence is not instantaneous. (Teletype Corp. teleprinters have a device called a "range finder", a manual adjustment to compensate for mechanical delays and variations within individual machines.)

By the second revolution it is ready. One by one, each rotation of the shaft, mechanically arranged to take exactly one bit-time per revolution, moves one of five small levers to the left or the right, depending on whether, at the start of each revolution, the current on the wire (and therefore in the selector magnet) is on or off. Let's assume that if the current is on (spacing) a lever is moved to the left; if the current is off (mark) it is moved to the right. When the sixth bit (start, plus five character data bits) is received, the seventh rotation forces a lever, with a human-readable character pressed onto its end, against the paper. If the machine is clean and well-oiled, the print-lever is back in its resting position long before the next character is complete.

(In a microprocessor system, when the start bit is detected, the little fast CPU simply waits out time between each bit, reading the wire and assembling the character in some memory location. Easy enough with a few hundred thousand transistors available.)
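A minimal sketch of that software receiver, under some loud assumptions: the wire has already been read into a list of samples (True for current on, False for current off) taken at a fixed number of samples per bit time, and each data bit is read in the middle of its bit time. Mid-bit sampling is a common software choice, not something described above; the function name receive_character is invented for this example.

```python
def receive_character(samples, samples_per_bit, data_bits=5):
    """Decode one character from oversampled line levels, or return None."""
    # Wait for the start bit: the first sample where the current is interrupted.
    try:
        start = samples.index(False)
    except ValueError:
        return None                            # the line never left its resting state

    code = 0
    for n in range(1, data_bits + 1):
        # Middle of the n-th bit time after the start bit began.
        mid = start + n * samples_per_bit + samples_per_bit // 2
        if mid >= len(samples):
            return None                        # ran out of samples mid-character
        bit = 0 if samples[mid] else 1         # current off reads as a 1, per the text
        code = (code << 1) | bit               # most-significant bit arrives first
    return code


# Three resting samples, then the "A" frame at four samples per bit time.
frame = [False, True, True, True, False, False, True]
samples = [True] * 3 + [level for level in frame for _ in range(4)]
print(receive_character(samples, samples_per_bit=4))   # 3, i.e. 0b00011
```

Note that the stop bit is never actually sampled; as far as this receiver is concerned, it is just the line being at rest again.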


bits, bauds, and modulation rate
--------------------------------

Bit rate, baud rate, and words per minute are really simple. Really. The bit rate is simply a measure of, well, the rate that bits are transmitted. It is the reciprocal of the period of each data bit. If each bit persists for .003333... seconds, then the bit rate is 300 bits per second.


         1
      ------  =  .00333333...
        300
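The same reciprocal relationship, as a trivial Python sketch (the function names are made up for this example):

```python
def period_from_rate(bit_rate):
    """Seconds per bit, given bits per second."""
    return 1.0 / bit_rate


def rate_from_period(bit_period_s):
    """Bits per second, given seconds per bit."""
    return 1.0 / bit_period_s


print(period_from_rate(300))      # about 0.00333 seconds per bit
print(rate_from_period(0.020))    # a 20 millisecond bit is 50 bits per second
```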

The BAUD RATE and the BIT RATE are the same iff (if and only if) all of the bits that make up a character are the same period; the same length. Uh oh. Yes, in the old teleprinter world, sometimes the stop bit was longer than the other bits, to give the mechanism time to whack the character onto the paper and get ready for the next character. Yes, it's a cheat.

There is an easy rule of thumb: if the bit (or baud...) rate is above 110, then bit rate == baud rate. The odd length stop bit business was only for mechanical teleprinters, which nearly always are slooow, 110 baud or less.

Also, more or less universally, the bit rate of all non-mechanical asynchronous serial is a multiple of 300. I don't know why 300; it's a multiple of 60 (and 50), not coincidentally the near-universal power line frequencies.

(So what's a baud? It's a unit of signalling speed: one baud is one electrical-state change (symbol) per second. For all serial communications of this type, each electrical state (mark or space, 0 or 1, on or off, etc.) conveys one bit of data. There are more efficient (and complex) schemes that pack more than one bit of data into each state change... but this is most definitely outside our scope. Here you can assume one bit per baud, or bit=baud.)

Always, without exception, start bits are the same length as data bits. Always, without exception, all data bits are the same length. Because stop bits are always spacing, the receiver never has to worry about how long stop bits are; as long as the mechanism 'recovers' before the next start bit, a long stop bit is indistinguishable from simply a delay before the start of the next character.

Too long a stop bit is never harmful; it simply wastes a little time between characters. Too short a stop bit can cause 'pileup', and mis-printing by the receiver.

Stop bit length is expressed as a multiple of the start and data bits; for example, for the ITA2 code, stop bits are 1.42 times the others. For 110 baud, which generally means a Teletype Corp. Model 33, stop bits are 2.0 times the others; just to be confusing, many times it is labelled "two stop bits", which amounts to the same thing.
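Since the stop bit is just a multiple of the other bits, the time for one whole character falls out of a single multiplication. A small sketch, with a made-up function name, assuming one start bit and a bit time given in seconds:

```python
def character_time_s(bit_time_s, data_bits, stop_units):
    """Seconds per character: one start bit, the data bits, and the stop bit."""
    return bit_time_s * (1 + data_bits + stop_units)


# ITA2 with a 22 millisecond bit time and a 1.42-unit stop bit.
print(character_time_s(0.022, data_bits=5, stop_units=1.42))   # about 0.163 seconds
```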


words per minute
----------------

In ye olden dayes of telegraphy, it was determined statistically that the average word in a telegram was six characters (seems short to me).

Most teletypes are rated in "words per minute", rather than bits per second or baud. The speed in words per minute is simply the character speed (in characters per second) times 60 seconds per minute, divided by six characters per word; in other words (sic), characters per second times 10. Then, because this is the horse'n'buggy era, round off to the nearest convenient number.

For example, standard "60 wpm" of the ITA2 code (5 bits per character) is six bits (start plus five data) at 22 milliseconds each, plus a 31 millisecond stop bit (31 is 1.42 times 22). That's 163 milliseconds per character, or 6.13something characters per second. Times 60, divide by 6, is 61.3 words per minute, only this is the olden dayes and we round it off to 60 wpm. Voila!
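The arithmetic above, as a short Python sketch. The function name is invented for this example, and the six-characters-per-word figure is the one quoted in this section.

```python
def words_per_minute(char_time_s, chars_per_word=6):
    """Words per minute, given the time taken by one whole character."""
    chars_per_second = 1.0 / char_time_s
    return chars_per_second * 60 / chars_per_word


# Six 22 ms bits plus a 31 ms stop bit is 163 ms per character.
char_time_s = (6 * 22 + 31) / 1000.0
print(round(words_per_minute(char_time_s), 1))   # 61.3
```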