Chain tension is slightly offset from where the axle passes through the frame. That offset load bends the axle, and if the load is big enough and applied enough times, the axle will break.
Early bicycles used a single rear sprocket, and axle sizes were selected according to the materials available and the loads imposed. The bearing was nearly at the dropout, so bending loads on the axle were small.
The distance from bearing to dropout is typically called "overhang".
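As a rough illustration of why overhang matters, the sketch below treats the length of axle between bearing and dropout as a simple cantilever with a solid round section. The force, the overhang values, and the 10 mm diameter are assumed for illustration only and do not describe any particular hub. A real axle is threaded and often hollow, so actual stresses would be higher.

```python
import math

def bending_moment_nmm(force_n, overhang_mm):
    """Moment at the dropout for a load applied at the bearing (cantilever model)."""
    return force_n * overhang_mm

def solid_axle_stress_mpa(moment_nmm, diameter_mm):
    """Peak bending stress in a solid round bar: sigma = 32 M / (pi d^3)."""
    return 32.0 * moment_nmm / (math.pi * diameter_mm ** 3)

# Assumed, illustrative numbers only (not measurements from any hub).
FORCE_N = 2000.0
AXLE_DIAMETER_MM = 10.0

for overhang_mm in (5.0, 20.0, 40.0):
    moment = bending_moment_nmm(FORCE_N, overhang_mm)
    stress = solid_axle_stress_mpa(moment, AXLE_DIAMETER_MM)
    print(f"overhang {overhang_mm:4.0f} mm: moment {moment:7.0f} N*mm, stress {stress:6.1f} MPa")
```

In this simplified model the stress grows in direct proportion to overhang, which is why each added sprocket that pushed the bearing inboard shortened axle life.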
When derailleur gearing was introduced, two sprockets could be fitted within the same dimensions, and thus without significant change in axle service life.
When three-sprocket gearing was introduced, the hub shell and bearings were shifted slightly left, leading to a slight increase in overhang. However, the dimensions and materials were good enough to support the loads.
The development of wider 4-sprocket clusters increased axle overhang and thus axle failures. 5-speed clusters followed with slightly narrower chains and sprocket spacings for about the same overall width. Broken axles were an ongoing problem, but were accepted as a necessary exchange for wide range gearing.
In the 1980s, makers introduced 6-speed and 7-speed freewheels. At the same time, dropout spacing increased from 120 mm to 126 mm, and for mountain bikes to 130 mm. Note that half the added distance goes to each side, so adding 10 mm in total adds only 5 mm for the sprocket cluster. SunTour offered narrow sprocket spacing for 6-speed clusters, and other makers followed for 7-speed clusters. Overhang varied, but overall stayed the same or got worse.
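The spacing arithmetic can be written out as a short sketch. It assumes, as stated above, that any increase in dropout spacing is split evenly between the two sides; the function name is made up for this example.

```python
def extra_width_per_side_mm(old_spacing_mm, new_spacing_mm):
    """Width gained on each side, assuming the increase is split evenly left and right."""
    return (new_spacing_mm - old_spacing_mm) / 2.0

# Dropout spacings mentioned in the text, in mm.
for old, new in ((120, 126), (120, 130)):
    gain = extra_width_per_side_mm(old, new)
    print(f"{old} mm -> {new} mm: {new - old} mm total, {gain:.0f} mm on the sprocket side")
```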
An obvious solution is a larger axle, but the bearing outside diameter typically must be small enough to fit inside the freewheel threads. A larger axle could be accommodated using smaller bearings, but bearing cone pitting was already a problem, suggesting smaller bearings would be a poor tradeoff.
That said, broken axle problems can be solved while still using freewheels. 1970s freewheel hubs such as Phil Wood and Bullseye have larger axles, and those axles are smooth rather than threaded, thus avoiding stress raisers. As a result, these hubs have markedly fewer axle failures. Indeed, such hubs were commonly used without failure by heavy riders and on tandems.
Below is a Bullseye hub, designed around the largest common bearing that fits inside standard freewheel threads, and using a 12 mm axle. Note the use of a center spacer between the bearings (seen through the cut-out): once the hub is clamped in a frame, the stack of spacers (and the bearing inner race) becomes a structural member, so the axle is effectively larger than 12 mm.
A larger axle could have been fitted using a lower-profile bearing with the same outside diameter. The trade-off is that a thinner-section bearing has a lower load capacity, and using multiple bearings (or a single bearing with multiple rows) does not typically make up the difference. The bearing in the Bullseye hub is a 6001 with a 28 mm outside diameter. A 6902 has the same outside diameter but accepts a 15 mm axle. Capacity varies with design details, but for two bearings of similar nominal design, the 6902 has about 85% of the load capacity of the 6001.
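The trade-off can be put in rough numbers. The sketch below uses the bore and outside diameter of the two bearings and the approximate 85% capacity figure quoted above. It also assumes solid round axles of the same material whose bending strength scales with the cube of the diameter. The names are illustrative only.

```python
# Bore and outside diameter in mm for the two bearing sizes discussed.
BEARINGS = {
    "6001": {"bore_mm": 12.0, "od_mm": 28.0},
    "6902": {"bore_mm": 15.0, "od_mm": 28.0},
}
# Approximate figure quoted in the text; real capacity varies with design details.
CAPACITY_6902_VS_6001 = 0.85

def radial_section_mm(bearing):
    """Radial height available for races and balls: (OD - bore) / 2."""
    return (bearing["od_mm"] - bearing["bore_mm"]) / 2.0

def axle_strength_gain(old_diameter_mm, new_diameter_mm):
    """Relative bending strength, assuming strength scales with diameter cubed."""
    return (new_diameter_mm / old_diameter_mm) ** 3

for name, bearing in BEARINGS.items():
    print(f"{name}: radial section {radial_section_mm(bearing):.1f} mm")

gain = axle_strength_gain(BEARINGS["6001"]["bore_mm"], BEARINGS["6902"]["bore_mm"])
print(f"axle bending strength: about {gain:.2f}x")
print(f"bearing load capacity: about {CAPACITY_6902_VS_6001:.2f}x")
```

Under these assumptions the 15 mm axle is nearly twice as strong in bending, at the cost of roughly 15% of bearing capacity.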
An even larger axle can be fitted by moving the bearing inboard of the freewheel threads. Doing so increases bearing overhang, but allows a much larger axle. Bending strength increases as roughly the cube of the diameter, so doubling the axle diameter gives roughly eight times the bending strength, more than enough to make up for the load from the increased overhang.
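To see how diameter and overhang trade against each other, the sketch below compares bending stress before and after a design change. It assumes a solid round axle, the same applied load, and stress proportional to overhang divided by diameter cubed. The 10 mm and 20 mm diameters and the doubled overhang are hypothetical values chosen for the example.

```python
def relative_bending_stress(old_diameter_mm, new_diameter_mm,
                            old_overhang_mm, new_overhang_mm):
    """Stress after the change divided by stress before, for the same load.

    Assumes stress is proportional to overhang / diameter**3.
    """
    overhang_ratio = new_overhang_mm / old_overhang_mm
    strength_ratio = (new_diameter_mm / old_diameter_mm) ** 3
    return overhang_ratio / strength_ratio

# Hypothetical change: axle diameter doubled, overhang also doubled.
ratio = relative_bending_stress(10.0, 20.0, 15.0, 30.0)
print(f"stress after change is {ratio:.2f} times the original")
```

A result below 1.0 means the larger axle more than compensates for the extra overhang. In this simple model a doubled diameter breaks even only when overhang grows eightfold.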
However, in the 1970s and early 1980s, hubs with oversize axles were markedly more expensive. Oversize axles and cartridge bearings can be made cheaply, but the hubs available at the time were much more expensive than most mainstream cup-and-cone hubs with standard axles. Price (in part) kept durable axles from being used widely.
Another approach would be an axle with a shoulder for the right-side cone, such as is used in the Britanica hub diagram at the top of the page. A shoulder means the point of highest load is slightly larger in diameter, and is also free of stress raisers. Shoulders were not widely adopted. Shouldered axles are only slightly more expensive to produce and could be used without changing bearing diameter, but axle bores in conventional bearing cups and hub bodies were typically a bit too small for the shoulder. It would have been easy enough for makers to use slightly larger dimensions, but they did not.
Another problem with a shouldered axle is that it locks the dimension from the right hub flange to the dropout. In many cases this dimension is standardized, but in general a shouldered axle would have required makers to offer a family of axles, or axles with special spacers. This would not have been technically challenging, but for whatever reason it was not done.
In the early 1980s, Maillard introduced their "Helicomatic" design, in which the hub shell was extended towards the dropout, and a freewheel slid over the shell extension:
The Helicomatic essentially eliminates bearing overhang and thus broken axles. The design used a thin extension to fit small sprockets, and to fit bearings inside the thin extension, the hub uses smaller bearings than standard hubs. As a result, bearing failures were a common problem.
Another problem with the Helicomatic is that it is a proprietary design and was therefore not adopted by other makers. The helical spline also adds somewhat to manufacturing cost. (The helical spline retained the freewheel and allowed much easier installation and removal than a conventional freewheel, a decided advantage for sport riders, but again a problem for manufacturing cost.)
In the 1980s, Shimano introduced modern "freehubs", in which the freewheel mechanism is integrated with the hub. In the Shimano design the right bearing is outboard under the sprocket cluster. The Helicomatic's "small bearing" problem is avoided by using a thread-in right bearing race. Although the difference in dimensions is slight, it is enough that conventional bearings can be used, and right bearing failures are no more common than with freewheel hubs.
These hubs are also called "cassette" hubs. The name comes from the design, which allows a stack, or cassette, of splined sprockets to be installed and removed together. The cassette sprocket cluster is not fundamental to elimination of broken axles, but cassettes can be removed much more easily than conventional freewheels. Freewheels are threaded on, self-tighten under load, and thus are often a problem to remove. The cassette approach gave users an additional reason to adopt hubs using the Shimano design, and Shimano was a large maker, so buyers felt sure of good parts availability despite the nonstandard design.
The basic Shimano patent covered a ratchet mechanism bolted to a hub shell. SunTour circumvented this patent by bolting the hub shell to the ratchet mechanism — a mechanically tiny difference that was enough to allow them to build their own freehub system.
Small-volume hub makers often bought Shimano ratchet mechanisms on the open market and made their own hub shells.
Many other makers switched to a layout in which the hub shell runs on bearings on the left side of the axle, and the cassette spline runs on the right side. This design reintroduced bending loads on the axle.
However, since the hub is proprietary anyway, makers have felt free to introduce nonstandard axles, which are smooth and thus free of stress raisers, and often much larger in diameter than standard freewheel axles. In the early years of this design, failed axles were common, but in a few years most makers had figured out the needed designs and dimensions to get good axle and bearing life.
A difference from the overhang of a freewheel hub is that the chain on a freewheel hub pulls cantilevered on the hub shell bearing, whereas on a cassette hub (except Shimano/SunTour) it pulls directly on the axle. So while bearing overhang is typically worse, axle bending loads are not further compounded by freewheel overhang.
A problem with Shimano and SunTour cassette hubs is the ratchet mechanism is dramatically smaller than on a freewheel, and thus subject to much higher loads for a given chain tension. Thus, pawls are much heavier and ratchet failures were for a long time fairly common. Modern versions of these seem to have low ratchet failure rates, although the ratchets are still quite small.
It is worth noting that little of the technology in cassette hubs was new at its introduction, although the combination of ideas was a useful advance. Among other things, note the four-bearing design appeared at least as early as 1938 with the Bayliss-Wiley "Unit Freewheel" system.
It seems remarkable that makers did not introduce cheap freewheel hubs with more durable axles before the freehub took over. The technology all existed at the time, and at least some inexpensive cartridge bearing freewheel hubs with oversize axles are now common for BMX use. It seems most likely the problem was inertia by the makers — as long as users are willing to put up with broken axles, why fix the problem?
 Photos from http://www.flickr.com/photos/stronglight/sets/72157604719497074 as of 2011/05.