The letter frequency counts (left-most column) are taken from one of the common books on cryptanalysis and give the number of occurrences per thousand letters of normal English text. Each character is broken down (under "Structure") into units: 1 for the minimum signal duration (one dit), 111 (three units) for a dah, and 0 (zero) for each unit of silence. The three units of silence required between characters (000) are appended to each character below.
Freq.  Letter  Structure         Units  Total
130    E       1000                  4    520
92     T       111000                6    552
79     N       11101000              8    632
76     R       1011101000           10    760
75     O       11101110111000       14   1050
74     A       10111000              8    592
74     I       101000                6    444
61     S       10101000              8    488
42     D       1110101000           10    420
36     L       101110101000         12    432
34     H       1010101000           10    340
31     C       11101011101000       14    434
28     F       101011101000         12    336
27     P       10111011101000       14    378
26     U       1010111000           10    260
25     M       1110111000           10    250
19     Y       1110101110111000     16    304
16     G       111011101000         12    192
16     W       101110111000         12    192
15     V       101010111000         12    180
10     B       111010101000         12    120
5      X       11101010111000       14     70
3      Q       1110111010111000     16     48
3      K       111010111000         12     36
2      J       1011101110111000     16     32
1      Z       11101110101000       14     14
1000   Sum                         292   9076
Average structure length (unweighted): 292 / 26 = 11.23 units. Average weighted by frequency: 9076 / 1000 = 9.076 units.
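The table can be rebuilt mechanically from the dit/dah patterns, which is a handy check on the arithmetic. The following is only a sketch: the names MORSE, FREQ and structure are made up for this illustration, and the frequencies are simply copied from the table.

```python
# Standard dit/dah patterns for the 26 letters of International Morse.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}

# Occurrences per thousand letters, copied from the table.
FREQ = {
    "E": 130, "T": 92, "N": 79, "R": 76, "O": 75, "A": 74, "I": 74, "S": 61,
    "D": 42, "L": 36, "H": 34, "C": 31, "F": 28, "P": 27, "U": 26, "M": 25,
    "Y": 19, "G": 16, "W": 16, "V": 15, "B": 10, "X": 5, "Q": 3, "K": 3,
    "J": 2, "Z": 1,
}


def structure(pattern):
    """Unit string: 1 per dit, 111 per dah, 0 between elements, 000 after."""
    elements = ["1" if mark == "." else "111" for mark in pattern]
    return "0".join(elements) + "000"


# Units per letter, including the trailing three-unit letter space.
units = {letter: len(structure(pattern)) for letter, pattern in MORSE.items()}

total = sum(FREQ[letter] * units[letter] for letter in FREQ)
weighted_average = total / sum(FREQ.values())
unweighted_average = sum(units.values()) / len(units)

print(structure("-.-."))  # the letter C
print(total, weighted_average, round(unweighted_average, 2))
```

Run as written, this should reproduce the 9076 total and the two averages (9.076 and 11.23) given with the table.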
From the above, if we take five times the frequency-weighted average letter length and add the space required for word spacing (seven units in total, or 0000000, of which three are already counted at the end of the last letter), we arrive at a normal English word length of 5 x 9.076 + 4 = 49.38 units. This is about 1.2% shorter than the 50 units of the standard word. (By contrast, a random five-letter group averages 5 x 11.23 + 4 = 60.15 units, which is 20.3% longer than the 50-unit standard word.)
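The word-length arithmetic can be sketched as follows. The constant and function names are invented for this illustration; 292 / 26 is the unweighted average of the Units column in the table.

```python
LETTER_SPACE = 3    # units of silence already counted at the end of each letter
WORD_SPACE = 7      # units of silence between words
STANDARD_WORD = 50  # units in the standard word


def group_units(average_letter_units, letters=5):
    """Units for a group of letters followed by one word space."""
    # Only the part of the word space not already counted is added.
    return letters * average_letter_units + (WORD_SPACE - LETTER_SPACE)


english_word = group_units(9.076)     # frequency-weighted average letter
random_group = group_units(292 / 26)  # every letter equally likely

print(round(english_word, 2), round(random_group, 2))
# Percent shorter (English word) and longer (random group) than 50 units.
print(round(100 * (1 - english_word / STANDARD_WORD), 1))
print(round(100 * (random_group / STANDARD_WORD - 1), 1))
```

The only subtlety is the "+ 4": each letter already carries its own three units of trailing silence, so a seven-unit word space adds just four more.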
A similar analysis of the numerals shows that the average length of a digit is 17 units (minimum 12, maximum 22), so a group of five digits (5 x 17 + 4 = 89 units) takes about 1.78 times as long to transmit as a standard 50-unit five-letter word.
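The figures for the numerals can be checked the same way. This sketch assumes that all ten digits are equally likely, which is what the simple average of 17 implies; the names DIGITS and units are made up for the illustration.

```python
# Standard dit/dah patterns for the ten numerals.
DIGITS = {
    "1": ".----", "2": "..---", "3": "...--", "4": "....-", "5": ".....",
    "6": "-....", "7": "--...", "8": "---..", "9": "----.", "0": "-----",
}


def units(pattern):
    """Length in units, including the three-unit space after the character."""
    marks = sum(1 if mark == "." else 3 for mark in pattern)
    gaps = len(pattern) - 1  # one unit of silence between elements
    return marks + gaps + 3


digit_units = {digit: units(pattern) for digit, pattern in DIGITS.items()}
average = sum(digit_units.values()) / len(digit_units)

# Five digits plus the four extra units that complete the word space.
five_digit_group = 5 * average + 4

print(min(digit_units.values()), max(digit_units.values()), average)
print(five_digit_group, five_digit_group / 50)
```

The shortest digit is 5 (five dits) and the longest is 0 (five dahs).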
Comparing these calculations will show some of the reasons why receiving speeds vary with the kind of material being sent.
As a matter of interest, we list here the letters from the shortest to the longest by the number of units (less letter space) -- notice that all lengths are odd numbers: 1 - E; 3 - I, T; 5 - A, N, S; 7 - D, H, M, R, U; 9 - B, F, G, K, L, V, W; 11 - C, O, P, X, Z; 13 - J, Q, Y.
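The grouping by length can be generated with a few lines of code. This is a sketch only; MORSE, mark_units and by_length are names chosen for the illustration.

```python
from collections import defaultdict

# Standard dit/dah patterns for the 26 letters of International Morse.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}


def mark_units(pattern):
    """Length in units without the trailing letter space."""
    marks = sum(1 if mark == "." else 3 for mark in pattern)
    return marks + (len(pattern) - 1)  # one-unit gaps between elements


by_length = defaultdict(list)
for letter in sorted(MORSE):
    by_length[mark_units(MORSE[letter])].append(letter)

for length in sorted(by_length):
    print(f"{length} - {', '.join(by_length[length])}")
```

The lengths are always odd because each dit or dah is an odd number of units (1 or 3), and n elements are joined by n - 1 single-unit gaps, so the total has the same parity as n + (n - 1), which is odd.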
If the same kind of calculation is carried out for several foreign languages, the following average character lengths are obtained (frequency data from Secret and Urgent, Fletcher Pratt, 1942, Tables II to IV, p. 253 ff.):
German 8.640, French 8.694, Spanish 8.286. These are roughly 4% to 9% shorter per character than in English.
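For comparison, the percentage by which each of these averages falls short of the English figure of 9.076 can be worked out directly. This is a small sketch using only the numbers quoted above; the variable names are illustrative.

```python
ENGLISH = 9.076  # frequency-weighted average letter length for English

# Average character lengths quoted above for the other languages.
OTHERS = {"German": 8.640, "French": 8.694, "Spanish": 8.286}

# Percent shorter per character than English, to one decimal place.
shorter = {
    language: round(100 * (1 - average / ENGLISH), 1)
    for language, average in OTHERS.items()
}

for language, percent in shorter.items():
    print(f"{language}: {percent}% shorter than English")
```

This gives about 4.8% for German, 4.2% for French and 8.7% for Spanish.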
There seems little doubt that if the code were somewhat redesigned and adjusted to optimize it for English, a reduction of about 5% could be made.
For the original American Morse code:
Mr. Ivan Coggeshall made a comparable analysis of American Morse, using the same normal dah lengths but word spacings one unit shorter, and arrived at a frequency-weighted average letter length of 7.978 (as compared with 9.076) and an average number length of 14. As noted in Chapter 16, American Morse timing is open to considerable variation.
The Art & Skill of Radio-Telegraphy
©William G. Pierpont N0HFF
This page last updated August 02, 1998
Modifications and compile by Thom LaCosta - K3HRN - December 2004