The History, Development and Impact of Computed Imaging in Neurological Diagnosis and Neurosurgery: CT, MRI, and DTI
ct scanning, diffusion tensor imaging, fmri, history, magnetic resonance imaging, tractography
A Filler. The History, Development and Impact of Computed Imaging in Neurological Diagnosis and Neurosurgery: CT, MRI, and DTI. The Internet Journal of Neurosurgery. 2009 Volume 7 Number 1.
A steady series of advances in physics, mathematics, computers and clinical imaging science has progressively transformed diagnosis and treatment of neurological and neurosurgical disorders in the 115 years between the discovery of the X-ray and the advent of high resolution diffusion based functional MRI. The story of this progress in human terms, with its battles for priority, forgotten advances, competing claims, public battles for Nobel Prizes, and patent priority litigation, brings alive the human drama of this remarkable collective achievement in computed medical imaging.
Atkinson Morley's Hospital is a small Victorian era hospital building standing high on a hilltop in Wimbledon, about 8 miles southwest of the original St. George's Hospital building site in central London. On October 1, 1971 Godfrey Hounsfield and Jamie Ambrose positioned a patient inside a new machine in the basement of the hospital, turned a switch, and launched the era of modern neurosurgery and neuroimaging.
Henceforth, there was a saying at Atkinson Morley's that “one CT scan is worth a room full of neurologists.” Indeed, neurological medicine and neurological surgery would never be the same. Everything that neurosurgeons had learned about diagnosis and surgical planning before that first scan was totally transformed by that event.
What came together on that remarkable fall day was a confluence of mathematics, science, invention, clinical medicine, and industrial resources that all arrived at that one time and place in a dramatic and powerful way. From a number of points of view, that first scan was no surprise to those who made it. Like Damadian's first MR image in 1977, Ogawa's first fMRI image in 1990, the first DTI image in 1991, or the first neurography image in 1992[3-5], Hounsfield's first scan was simultaneously expected yet astonishing[6-9]. The participants knew generally what they hoped to see, but in each case the result both met and exceeded the dream. The scientist was rewarded by the shimmering appearance on a computer screen of a view of the human body that no one else had seen before.
Because of the complexity of computed imaging techniques, their history has remarkable depth and breadth. The mathematical basis of MRI relies on the work of Fourier - which he started in Cairo while serving as a scientific participant in Napoleon's invasion of Egypt in 1801. Diffusion Tensor Imaging relies on tensor math that was developed in part by Albert Einstein in his efforts to summarize the transformations of space and time in his general theory of relativity. The physics involves matter-antimatter reactions, nuclear spins, and superconducting magnets. What we can see ranges from the large tumors of the first CT images to the subtle patterning of fMRI that reveals the elements of self and consciousness in the human mind. Medical imaging is starting to press upon the edge of philosophy itself.
Another reflection of the complexity of these technologies is that each major advance has a variety of facets - many different competing inventors and scientists therefore seem to see primarily their own reflection when looking at the same resulting gem. Lenard fought bitterly with Rontgen over the discovery of the X-ray, continuing to vigorously attack him and his work for decades after Rontgen had died. A dozen inventors of tomography fought each other for priority until their shared technology was abruptly superseded by CT scanning, so that all of their works faded into irrelevance before the dust of the internecine battles could even begin to settle. Efforts by Oldendorf and by Cormack to develop computed tomography were totally outrun by Hounsfield because his employer EMI (Electrical and Musical Industries, LTD) was buoyed by a vast cash geyser from John Lennon and Ringo Starr - the competitors couldn't beat an engineering genius funded by sales of Beatles records in the late 1960s. Damadian pled his rage to the world in full page ads in the New York Times when the Nobel prize committee discarded his contribution in favor of his longtime rival Paul Lauterbur and of Mansfield, whom he regarded as totally insignificant.
In fMRI, one group from Harvard's Massachusetts General Hospital grabbed the scientific imagination with Belliveau's dramatic cover illustration in Science, but eventually lost out to the rightness of Seiji Ogawa's model of fMRI using BOLD (blood oxygen level dependent) MRI that did not require injection of contrast agents and which was published a year earlier. Filler, Richards, and Howe published the first DTI images in 1992,[2, 13] but Basser and Le Bihan at NIH who hold a competing claim to the invention of DTI never referenced the work by the Filler group even once after more than 17 years.
The Basser and LeBihan story is most illuminating, as a number of historians have marveled at how in the 1930's and perhaps as late as the 1960's, major early workers in a given field of imaging research could progress over years without being aware of each other's work.[14, 15] The apparent lack of awareness has appeared to be due to barriers to information flow in the 19th and earlier 20th centuries, such as difficulties for English speaking scientists in accessing work done outside the major Western centers of science. The prime example of this is Tetel-Baum's early work in CT scanning[16-18] done in Soviet era Kiev in the 1950's and published only in Russian - which had to be “discovered” many years later. The DTI story shows that the causes of apparent unawareness may have more to do with anthropology and psychology than with limitations of academic communication.
The story of computed imaging therefore provides both a fascinating opportunity to understand the progress of a science that underlies much of what neurologists and neurosurgeons do today, as well as providing a riveting view into competition and victory in the arena of scientific accolades, clinical impact, patent litigation, and the media, as well as the ultimate judgment of the eyes of history.
X-rays and Tomography
Discovery of an unexpected natural phenomenon coupled with the eerie ability to see the skeleton in a living person captured the world's imagination on an almost explosive basis when Wilhelm Rontgen (Figure 1) showed his first images. He had been working with apparatus developed by Lenard that was used to generate “cathode rays.” These are electrons generated in a glass vacuum tube when a voltage is applied between a cathode and an anode. When the electrons strike the glass, they cause it to glow - and this can be seen in a darkened room. Rontgen had sealed up a tube to be sure no light would be emitted from the glass so that he could see if the cathode ray would penetrate the glass to strike a piece of cardboard next to the tube that had been painted with a fluorescent substance - barium platinocyanide. However, when he tested the device to make sure it was completely light sealed, he noticed a glow from a table at some distance away - a distance far too great to be reached with cathode rays. On the table was a sheet of cardboard painted with the barium platinocyanide. He turned off the current to his tube and the glow from the distant table stopped abruptly. He turned on the current to the tube, and the distant cardboard immediately began to glow again.
He discovered this on November 8th 1895, but told no one, working feverishly in secret for seven weeks to fully explore his discovery of “X” rays. Finally he submitted a publication that showed a photograph of a skeletal hand. The report was published on December 28, 1895, and although the first few newspapers he approached declined to report on it initially, the editor of an Austrian paper did run the story, and the news was then rapidly picked up and reported in newspapers around the world.[14, 15]
We now understand X-rays to be electromagnetic radiation, emitted by electrons, whose photons have much higher energy and far shorter wavelength than photons of visible light. Formerly, there was a distinction made between X-rays and gamma rays based on the even higher energy and even shorter wavelength of gamma rays. There is now thought to be so much overlap in the spectra that the two are distinguished by source - gamma rays originate in the nucleus. Although Rontgen really had no idea at all what his “X-rays” were, he was the first winner of the Nobel Prize in physics, which was awarded in 1901.
Philipp Lenard, however, was furious that he did not get the prize and the recognition, since the apparatus and basic experimental set up were his. He also insisted that he had seen the same phenomenon of distant fluorescence and was doing a more reasoned and formal investigation of the physics before Rontgen rushed out with the dramatic photographs of the skeletal fingers. Although Lenard was awarded the Nobel prize himself in 1905 for his work in cathode rays, he continued to bitterly criticize Rontgen. Philipp Lenard later attacked Einstein for differing from him over the behavior of cathode rays. Still later, Lenard became the Chief of Physics under Hitler - in which position he attacked Einstein's physics as a fraud, which no doubt allowed the Germans to fall far behind the Allies in the development of the atomic bomb.
Against the drama of the discovery of X-rays and its truly electrifying effect on the world at large, the history of tomography presents a very pale shadow. The driving idea here was to get a better look inside the chest so that the heart, lungs, and any tuberculosis or tumors could be better seen without the interference of the rib cage in front and behind. The fact that radiologists today still rely primarily on non-tomographic chest X-rays speaks volumes about the clinical impact of the whole endeavor. Essentially, the idea is to move the X-ray source to the left while the image plate is moved to the right. The axis of rotation of a line from the source to the plate must be on a plane of interest inside the body. The result will be that structures in the middle of the patient (along the plane of the axis of rotation) will remain relatively clear while those in back and front will be blurred.
The various patents and theories for accomplishing this differed in details such as whether the source and plate would be linked rigidly as by a pendulum (the plate goes through an arc) or, alternatively, whether the plate would remain parallel to the imaged plane in the patient, and so on (see Figure 2). Each different method had a different name - stratigraphy, planigraphy, sterigraphy, laminography, etc. The patents were often competing and overlapping, but filed in different countries. The machines were generally mechanical devices with hinges, levers and pendulums. The patents typically described pure mechanical devices without any obvious proof that the images produced were better or worse than those from any other method, if any images were produced at all. There was no serious mathematical or physical analysis of the designs.
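The blurring principle described above can be illustrated with a deliberately simplified one-dimensional sketch. It assumes (this is an idealization, not a description of any historical machine) that a structure's shadow slides across the moving plate by an amount proportional to its distance from the focal plane times the displacement of the source. The function name `tomogram` and all numbers are hypothetical.

```python
def tomogram(features, width, sweep):
    """Sum exposures on a 1-D plate while source and plate counter-move.

    features: list of (position, depth_offset) pairs; depth_offset is the
              distance from the focal plane (0 means "in the plane of interest").
    sweep:    iterable of integer source displacements.
    Assumption: the shadow shifts by depth_offset * displacement.
    """
    plate = [0] * width
    for u in sweep:
        for position, depth_offset in features:
            shadow = position + depth_offset * u  # in-plane shadows do not move
            if 0 <= shadow < width:
                plate[shadow] += 1
    return plate

# One in-plane structure (depth 0) and two off-plane structures ("ribs").
plate = tomogram([(10, 0), (30, 2), (30, -1)], width=50, sweep=range(-5, 6))
in_plane_exposure = plate[10]          # every exposure lands on the same pixel
off_plane_peak = max(plate[20:41])     # off-plane shadows are smeared thin
```

In this toy model the in-plane structure accumulates all eleven exposures at one pixel, while the off-plane structures are spread over many pixels and never accumulate more than two.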
With the elapse of time, the tomographic systems became more complex without becoming any better or more useful. The movements of the source and plate could be quite complex involving circles and spirals. Systems were provided for advancing the plane of imaging so a series of tomograms could be made.
In 1937, William Watson filed a patent application (granted in 1939) for the first important system that made a series of axial tomographic image slices. The patient sat on a stool that was rotated through a full circle as the image was being made, while the X-ray plate was moved as well. The seat of the stool was mounted on a telescoping column that could be raised or lowered to get axial sections at various locations in the body. This device actually achieved considerable commercial success.
A number of inventors developed devices that rotated the patient. Among the most interesting aspects of these later developments conceptually was the emergence of the idea of 'non-blurring' tomography that could produce an axial cross section of the patient in which the tissue outside the plane of interest was not even exposed. Although these worked, they involved a truly enormous amount of X-ray exposure.
From Axial Tomography to Computed Axial Tomography
The next important advance was a non-computed axial tomographic device that employed the novel idea of “back-projection.” This is also one of the fundamental aspects of Hounsfield's computed tomography and of Paul Lauterbur's initial MRI design. It is certainly the single most important technical advance to emerge from the sixty year history of non-computed tomography. Gabriel Frank filed a patent in 1940 that fully worked out the methodology for this approach to imaging[14, 20] (see figure 3).
In back-projection, an emitter shines an X-ray through a subject to a “receiver” that transduces the incoming X-ray light to produce a linear trace on a rotating drum. Gradually, the X-ray source and the collimated entry filter of the receiver are swept from one edge of the subject to the other. If the subject is a phantom cylindrical column made of perspex with a dense dowel at its core, then the receiver will show a high intensity linear trace as the beam progresses steadily across the perspex, then drops off abruptly when the beam line crosses the dowel and then comes back up again once the dowel is passed.
We now have a linear trace that describes the position of the dowel as an area of decreased exposure along one portion of the detection line. We then rotate the subject a few degrees and do the same thing again producing a new trace and continue to do so again and again until traces are obtained from a large number of directions (for instance 360 views if the subject is on a turntable that is turned by one degree just before each complete edge to edge trace is carried out).
At this point we have a drum with a record of the series of linear exposures - much like an archaic gramophone cylinder. We can now use the lines recorded on the cylinder to drive an exposure light to create a film image of what has been recorded. Note that the data collection step is now completely separated from the film exposure step.
We lay a sheet of unexposed film flat on a turntable. We have a light source that shines a narrow beam across the film. When the beam is on, it exposes a line of light onto the film. When the beam is off, no exposure takes place. We shine a focused line of light from the source from one edge of the film to the other edge, controlling the intensity of the exposing light by the intensity recorded on our trace. As the source moves across from one edge towards the other it remains dark, but when it encounters the blip where the dowel blocked the X-ray, it turns on the beam and a line of light is exposed onto the film at that point. The turntable is then rotated and the next line played out. Eventually, a thin line of light will be projected across the film from one point during each of the 360 differently angled exposure traverses. Importantly, all of the 360 lines will cross at just one point on the film. This point will have by far the greatest exposure and this point will expose as bright white exactly where the dowel was in our perspex model.
This is the fundamental idea of back-projection. The idea of positron emission tomography, computed axial tomography, and Paul Lauterbur's MRI is to do what the mechanical back projection system has done, but to do it quickly relying on electronics, computer reconstruction techniques, and more advanced physics.
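The dowel-in-perspex thought experiment can be reproduced numerically. The following is a minimal sketch of simple (unfiltered) back-projection, not a reconstruction of Frank's or Hounsfield's actual method. The grid size, dowel position, and function names are illustrative assumptions. Each trace records 1.0 where the beam was blocked by the dowel, which plays the role of the "beam on" signal during film exposure.

```python
import math

def project_dowel(dowel_xy, angles_deg, n_bins, half_width):
    """One trace per view: 1.0 in the detector bins shadowed by the dowel."""
    x0, y0 = dowel_xy
    centre = n_bins // 2
    traces = []
    for angle in angles_deg:
        t = math.radians(angle)
        s0 = x0 * math.cos(t) + y0 * math.sin(t)  # dowel position along the sweep
        traces.append([1.0 if abs((b - centre) - s0) <= half_width else 0.0
                       for b in range(n_bins)])
    return traces

def back_project(traces, angles_deg, size):
    """Smear each trace back across the 'film' along its own viewing direction."""
    centre = len(traces[0]) // 2
    half = size // 2
    film = [[0.0] * size for _ in range(size)]
    for trace, angle in zip(traces, angles_deg):
        t = math.radians(angle)
        cos_t, sin_t = math.cos(t), math.sin(t)
        for row in range(size):
            y = row - half
            for col in range(size):
                x = col - half
                b = int(round(x * cos_t + y * sin_t)) + centre
                if 0 <= b < len(trace):
                    film[row][col] += trace[b]
    return film

views = list(range(360))                      # one view per degree of turntable rotation
traces = project_dowel((8, -5), views, n_bins=61, half_width=0.5)
film = back_project(traces, views, size=41)
# Brightest film pixel as (exposure, row, col); the dowel sits at row 15, col 28.
peak = max((film[r][c], r, c) for r in range(41) for c in range(41))
```

Only the pixel where the dowel sat receives a contribution from every one of the 360 lines, which is the crossing point described above. Practical CT reconstruction adds a filtering step or uses other algorithms, which this sketch omits.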
Invention and reduction to practice of CT scanning
We know the most about four entirely independent researchers who saw the opportunity to take advantage of recent advances in computers in the 1960's to develop a computer based, back projection, axial tomography system. These workers published and filed patents as they progressed. Hounsfield was the fifth worker. He was not an academic. He did not publish. He only filed patents very late in the process so that most of the work was done before the patents were published. He was funded internally at EMI so there were no grant proposals. He had an unmatchable budget to do his work. He made a series of well executed stepwise advances that allowed him to continue to work in secrecy while renewing his financial support within the corporation.
Oldendorf at UCLA developed a model that differed from the non-computed axial tomogram in that the subject moved along a line as it rotated.[14, 15] A computer then sorted out the motions to carry out the back projections and display the results on a computer screen. He presented it to an imaging manufacturer and was patiently told that there was no use for his machine - so he abandoned the effort. A group in Kiev built a working model, but published in Russian and never progressed the work.[16, 17] At Massachusetts General Hospital, Brownell and Chesler used positron emissions in a computed back projection system and then later developed a system in which a gamma ray source was used to carry out a transmission computed tomographic image experiment.
Allan Cormack was a South African physicist who joined the faculty at Tufts University in Boston in 1957 and later published (in 1963 and 1964) a solution of the problem of “line integrals”[23, 24], a mathematical technique that is used in most modern CT scan computation - although Hounsfield did not use this mathematical approach. Cormack was awarded the Nobel Prize in 1979 for the invention of CT scanning along with Hounsfield. It was later appreciated that several decades earlier, Johann Radon (an Austrian mathematician) had solved and published much of what Cormack had done. It was also appreciated later that further advances on the mathematics had also been published previously - in Russian - by the Kiev group. Hounsfield cites Cormack's papers in his 1968 patent submission (granted in 1973) but dismisses Cormack's math as not usable for practical applications. None of the others (Cormack, Kuhl, Oldendorf) knew of Hounsfield's secret work.
Hounsfield's biggest setback came when the moment arrived to travel to the National Neurological Hospital at Queen's Square in London to meet with the chief of neuroradiology. He explained what he had accomplished and proposed the construction of the first tomographic scanner in order to make computed tomographic images to show slices of brain structure in patients. The neuroradiologist patiently explained to Hounsfield that with pneumoencephalography, plane tomography, and angiography, there was no existing brain lesion that could not be diagnosed by imaging already. There was no obvious clinical use for a computed tomogram machine, as tomograms in general weren't really all that useful. He was sent packing. It is told apocryphally at Atkinson Morley's Hospital (AMH) that as soon as Hounsfield had left, the radiologist at Queen's Square took the time to pick up the phone to call the official at the ministry of health who had sent Hounsfield to see him - the official was warned in no uncertain terms never again to waste the radiologist's time with crackpot inventors peddling ridiculous contraption ideas such as this.
Hounsfield, of course, figuratively picked himself up, dusted himself off, and managed to solicit a referral to the chief of neuroradiology at the number two neurological hospital in London - Jamie Ambrose at Atkinson Morley's Hospital in Wimbledon - the initial meeting took place in 1967. Ambrose had an interest in using ultrasound to image inside the skull and was familiar with the axial tomography concept. He liked Hounsfield's proposal, and others at AMH thought it sounded sufficiently eccentric and interesting as to be worth a try (see figure 4).
The entire staff of the hospital was sworn to secrecy for the duration of the construction and testing. Atkinson Morley's is fairly secluded and surrounded by woodlands on three sides, so secrecy was easily achieved. The machine was built along a plan for commercial production. The first test resulted in the images shown as figure 5.
It was time for the first patient - the data tape was collected and then sent across London for analysis, computed back projection and image reconstruction. The new image tape was rushed back to AMH where the result was viewed by Jamie Ambrose. It was immediately apparent to the assembled neuroradiologists, neurologists and neurosurgeons of AMH that something of truly spectacular clinical utility had emerged. Several more patients were scanned - each with complex pathology, each producing a crude but riveting set of image scans.
Photographs of the first five patient scans hung on the wall of the radiology department of AMH until the service was moved out to an “Atkinson Morley Wing” of the new St. George's Hospital in Southwest London in 2003. Figure 6 is a photograph taken of the wall in the radiology reading room at AMH and figure 7 shows Hounsfield holding a data tape, talking with one of the first CT technologists.
The result was announced to the world and received enormous media and clinical attention. Hundreds of radiologists, neurologists and neurosurgeons from around the world headed for Wimbledon to see the new machine at AMH. Orders poured in to EMI despite the then astonishing $300,000 price tag.
As one might imagine, ongoing worldwide sales of working clinical units (EMI scanners as they were called) abruptly put an end to all other attempts to learn how to do computed axial tomography, but simultaneously launched an intensive, high powered battle to achieve commercially valuable improvements of the device, which continues to this day. For nearly ten years, EMI deployed its patents to try to hold off potential competitors in court. It used a strategy of filing patent infringement litigation then offering settlements with sealed documents. In this way, each company that they sued had to start from scratch to try to assess the strength of the EMI patents, but EMI avoided the huge expense and unpredictability of full jury trials to assess its patent rights. Since back projection itself was not an invention and no unique algorithms were used at first, much of the patent was based on Hounsfield's findings that the X-ray beam itself could sensitively distinguish tissues when properly deployed at low intensity.
EMI rapidly advanced through four generations of scanners, steadily reducing scan time, reducing computing time and improving spatial resolution. It also started development of an MRI scanner project. However, in the early 1980's the scanner unit succumbed to the pressure of competition and litigation, becoming a money losing activity - upon which it was sold off to the British company GEC (General Electric Company, plc). Its corporate remnants were later assembled with products from an American company called Picker and from Elscint - also scanner manufacturers - to result in a division at GEC called Picker International, which was later renamed Marconi. This unit was sold to Philips in 2001.
Continuing advances in CT scanning include dramatic advances in the speed of scanning and the use of simultaneous acquisition of as many as 128 image slices, all of which have improved CT's capabilities to stop motion like a fast camera. Together with advanced intravenous contrast agents, the detail and quality of CT angiography for coronary and cerebral vessels continues to advance. At a different extreme, small light mobile “O-arm” units have been developed that allow for real time CT scanning in the operating room.
The Physical Basis of Nuclear Magnetic Resonance
The history of MRI can be considered in three phases - the discovery of the fundamental physics and biological properties of nuclear magnetic resonance, the emergence of designs to accomplish imaging with MRI, and finally the emergence of neurologically optimized methods such as diffusion tensor tractography and functional MRI.
From a number of points of view, the very possibility of MRI at its outset and the most exciting destination of the technology in modern fMRI are embodied by its true grandfather Wolfgang Pauli (see figure 8), an extraordinarily talented and extraordinarily troubled Viennese physicist. He differs in many ways from the other scientists and inventors covered in this article, but in no way more strikingly than in the fact that he was utterly unconcerned with establishing priority for his work. He generally did not even bother to publish but just sent out his ideas in letters to his prominent friends and colleagues such as Werner Heisenberg and Niels Bohr. Despite the carelessness of documentation, we know more about him than about nearly any other scientist because - following a nervous breakdown after a divorce at age 31 in 1931 (no doubt precipitated in part by his intensive work leading to his discovery of the neutrino) - he became a patient of Carl Jung, who later published descriptions of more than 400 of Pauli's dreams.
Pauli's first publication was an article evaluating Einstein's theory of general relativity that he published at age 18. In fact Pauli's analyses of relativity were so well received that it was Albert Einstein himself who nominated Pauli for the Nobel prize he received in 1945. Pauli discovered many remarkable things about nature, its particles and their quantum behavior. Most importantly for MRI, noticing some irregularities in some spectra he was evaluating, he made the suggestion - in 1924 - that atomic nuclei should have magnetically related spins. He was correct in this and this is the physical basis of magnetic resonance upon which everything else in this field is established.
A number of physicists set out to test Pauli's ideas on nuclear magnetism, deploying a variety of experimental devices and systems. Most important in this period of time was the success of Isidor Rabi in 1938 (see figure 8). Rabi - who won the Nobel Prize in 1944 - beat out numerous brilliant competitors by realizing how to design an experiment to detect and measure the magnetic spin of atomic nuclei. In an arrangement used by other nuclear physicists, a gaseous beam of nuclei of a given element was sent past a magnet which deflected the beam before it hit a detector. Rabi added an additional electromagnet whose field strength could be rapidly oscillated. In an inverse of how this is typically done today, he was able to vary the strength of the magnet. At a particular combination of field strength and frequency of the magnetic oscillator, the beam would abruptly begin to bend to a new deflection point. Rabi was using a tuned resonant frequency to pump electromagnetic energy into the protons. The particular mix of field strength and frequency varied from one element to another. He had proven the existence of magnetic spin, showed how to identify the “gyromagnetic ratio” of every element, and demonstrated the phenomenon of using varying fields to manipulate magnetic resonance.
An interesting historical note about another famous contest that Rabi won concerns the first atomic bomb explosion - the Trinity test at Los Alamos in July of 1945. Bets from the various physicists at the site about the potential force that would result from the fission chain reaction ranged from dud to annihilation of the universe. Rabi predicted 18 kilotons of TNT, coming very close to the measured 20 kiloton force that was actually recorded.
With the end of World War II later that year, physicists returned to peaceful pursuits and the first great result for MR came independently from Purcell at Harvard and Bloch at Stanford in 1946 (see figure 8). Each published their finding that the magnetic resonance effect that Rabi had observed in gases could also be detected in solid materials. Bloch filed a patent for the first NMR spectrometer in December of 1946 (granted in 1951). This opened the era of nuclear magnetic resonance study of a wide array of materials including biological specimens. Further work in Nuclear Magnetic Resonance by Hermann Carr, along with Purcell, together with modifications by Saul Meiboom and Gill, led to the development of pulse sequences of radiofrequency and magnetic energy (CPMG sequences - for Carr, Purcell, Meiboom, Gill) that could identify different rates of magnetic field decay in a given type of element situated in various different chemical and physical surroundings.
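The pairing of field strength and frequency that Rabi observed is conventionally summarized by the Larmor relation, in which the resonant frequency equals the gyromagnetic ratio (divided by 2π) times the field strength. The sketch below is illustrative only: the relation's name and the constants are standard reference values (rounded), not figures taken from this article, and the function name `larmor_mhz` is hypothetical.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla.
# Rounded textbook values, supplied here as assumptions for illustration.
GAMMA_BAR_MHZ_PER_T = {
    "1H": 42.577,
    "13C": 10.708,
    "31P": 17.235,
}

def larmor_mhz(nucleus, field_tesla):
    """Resonant (Larmor) frequency in MHz for a nucleus in a given field."""
    return GAMMA_BAR_MHZ_PER_T[nucleus] * field_tesla

# Protons in a 1.5 tesla field resonate near 63.9 MHz; other nuclei
# resonate at different frequencies in the same field.
proton_at_1p5t = larmor_mhz("1H", 1.5)
carbon_at_1p5t = larmor_mhz("13C", 1.5)
```

Because the ratio differs between nuclei, the same field strength yields a different resonant frequency for each, which is why the "particular mix" varied from one element to another.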
An example of the kind of tasks that can be accomplished this way is the “spin echo” - a term that applies to what is done in the vast majority of modern MRI scans. This was conceived of, measured and proven by Erwin Hahn (see figure 8). When a radiofrequency pulse of the correct frequency is applied to a sample in a magnet, the selected protons will spin in phase with each other because they are all being driven by the same stimulating oscillating wave form. We then turn off the stimulating signal and listen to the emitted oscillating signal from the stimulated protons. As they all spin around together, they generate a signal that is strong as the tipped magnetic poles swing towards our antenna and weak as they spin away. This emission oscillates at the same resonant frequency at which the protons were stimulated. However, with the elapse of time, the signal decays away as the added energy from our stimulating pulse is dissipated. The signal - oscillating and steadily decaying away - is called the FID (free induction decay).
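The oscillating, steadily decaying signal can be modelled in its simplest form as a cosine multiplied by a decaying exponential. This is a sketch under stated assumptions: a single resonant frequency, a single decay time constant (called `tau` here), and illustrative numbers that do not come from the article. Real signals oscillate at far higher frequencies before demodulation.

```python
import math

def fid(t, freq_hz, tau):
    """Idealized free induction decay sample at time t (seconds)."""
    return math.exp(-t / tau) * math.cos(2.0 * math.pi * freq_hz * t)

def envelope(t, tau):
    """The decaying amplitude that bounds the oscillation."""
    return math.exp(-t / tau)

# Sample a 100 Hz oscillation with a 50 ms decay constant every millisecond.
samples = [fid(n * 0.001, freq_hz=100.0, tau=0.05) for n in range(200)]
start_amplitude = samples[0]
amplitude_after_one_tau = envelope(0.05, 0.05)  # falls to 1/e of the start
```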
Hahn had an ingenious idea to alter the way in which the spinning protons dissipate their introduced spin energy. Thinking of the spinning protons as spinning tops, imagine knocking them with the RF energy in such a way that instead of their axis of spin being vertical, it is actually horizontal with the foot of the spinning top resting on a vertical wire axis. In addition to spinning around its now horizontal axis, the top also “precesses” around the vertical wire.
As long as the axis of the spinning proton is horizontal it emits a strong signal. With dissipation of the pulsed-in energy, the angle of the axis slowly returns towards vertical. We can call the magnetic output of the proton when it is vertical the “longitudinal” magnetization and when it is horizontal we say there is also a “transverse magnetization.” As the orientation of the spinning proton returns to vertical - which is parallel with the main magnetic field of the magnet - the signal generated by the spinning transverse magnetization fades away - this is the T1 relaxation process.
Another aspect of the way in which the RF pulse tips the spin axes is that in addition to being horizontal, they are precessing around the wire coherently in phase with all the other surrounding protons. Because they are all in phase with each other as they precess around and as they spin, they join together to produce a strong coherent oscillating radiofrequency signal that we “hear/analyze” with our antenna once the stimulating pulse is turned off.
However, two other types of signal decay come into play as we consider this situation. Firstly, the precessing protons will interact with each other's magnetic fields and will disrupt each other's spin rates so that the spins gradually dephase from each other and the signal fades away. These are called “spin-spin” interactions and this type of signal decay is called the T2 relaxation decay. These random interactions will differ in quality from one tissue to another depending on how freely mobile the water protons are - the decay is fast in protein laden solutions and slow in water where the protons tumble freely.
In addition, there may be non-uniform aspects of the general environment. For instance, a blood vessel nearby will have some iron in deoxyhemoglobin and this will uniformly affect spins nearby causing more rapid loss of phase coherence and therefore signal loss. This is called T2* decay. In some types of imaging such as BOLD for functional MRI, we want to emphasize these effects. In most other types of imaging we want to suppress T2* effects as they may be irrelevant to the aspects of tissue anatomy we are interested in.
Erwin Hahn's idea - further updated by Carr and Purcell - was that there was a way to “refocus” the precessing protons to eliminate T2* effects (see figure 9). First, we deliver the RF pulse that flips the protons into horizontal configuration. This is called the 90 degree pulse. Then after a selected echo time interval, we apply a second 180 degree pulse that flips the axis into the opposite horizontal position.
Quite separately, return of the green arrow towards the vertical (not shown) would reflect the T1 relaxation (drawing by AG Filler - copyright: GFDL 1.3/CCASA 3.0, image source: http://en.wikipedia.org/wiki/File:Spin_Echo_Diagram.jpg).
In the initial 90 degree position, we can think of a single thick vector line representing the combined output from all the vectors spinning at identical speed and phase, and so representing the effects from all the stimulated protons in a sample. As these spins slowly lose coherence - some precessing a little faster, some a little slower, all in response to the local magnetic environment - the single thick vector spreads out. These T2* effects make the observed signal decay far more rapidly than it would from T2 decay alone. Strangely enough, with the 180 degree refocusing pulse, the spreading effect reverses itself. The spins that were spreading apart in their phases begin slowly drifting back into phase with each other. We typically measure the T2 signal intensity at the point at which the refocusing is complete. This places the refocusing pulse exactly half way between the time of the 90 degree stimulating pulse and the echo time (TE). The TE is the point at which the frequency and phase gradients are deployed to read out the resulting signal strength. The removal of the T2* effects allows for the T2 decay itself to be observed over far longer decay times and so provides far more complex and subtle T2 contrast between and among various tissues.
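The refocusing argument can be checked numerically. The following is a minimal sketch, not a description of any scanner's software: the function name `net_signal`, the 20 Hz spread of local precession offsets, the 100 ms T2 and the 80 ms TE are all assumed values chosen for illustration. Each spin keeps a fixed offset in precession rate (the T2* source), and the 180 degree pulse at TE/2 is modeled as negating the phase accumulated so far.

```python
import numpy as np

def net_signal(t, offsets_hz, t2, te=None):
    """Relative signal magnitude at time t (seconds) after the 90 degree pulse.

    offsets_hz: fixed precession-rate offset of each spin (the T2* source).
    t2: time constant of the irreversible spin-spin decay, in seconds.
    te: echo time; if given, a 180 degree pulse is applied at te / 2.
    """
    if te is not None and t >= te / 2:
        # The 180 degree pulse negates the phase built up by te / 2, after
        # which each spin keeps precessing at its own unchanged rate:
        # -w * te/2 + w * (t - te/2) = w * (t - te).
        phase = 2 * np.pi * offsets_hz * (t - te)
    else:
        phase = 2 * np.pi * offsets_hz * t
    coherence = np.abs(np.mean(np.exp(1j * phase)))
    return coherence * np.exp(-t / t2)

rng = np.random.default_rng(0)
offsets_hz = rng.normal(0.0, 20.0, size=2000)  # assumed spread of local fields
t2 = 0.100                                      # assumed T2 of 100 ms
te = 0.080                                      # assumed echo time of 80 ms

without_echo = net_signal(te, offsets_hz, t2)         # dephased by T2* effects
with_echo = net_signal(te, offsets_hz, t2, te=te)     # only T2 decay remains
print(f"signal at TE without refocusing: {without_echo:.3f}")
print(f"signal at TE with refocusing:    {with_echo:.3f}")
```

In this toy model the refocused signal at TE equals exp(-TE/T2) exactly, because every static offset cancels at the echo, while the unrefocused signal has almost vanished.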
Invention of Magnetic Resonance Imaging
With all of these pieces in place, the stage was set for the great drama of the invention of magnetic resonance imaging itself.
Throughout the 1950's and 1960's, NMR was used to test and evaluate a wide variety of substances and tissues. In a typical NMR machine, there is a small tube in the midst of the magnet and the material or tissue to be studied is placed in the test tube. In 1970 Raymond Damadian (see figure 10) - a research physician at the State University of New York (SUNY) Brooklyn campus - thought to measure T1 and T2 relaxation times on various tumors in comparison with related tissues. Damadian found that the T2 was longer in the tumors he studied by comparison with normal tissue. This finding, published in Science in 1971, electrified the magnetic resonance community because it suggested that there could be an important medical use for NMR in testing tissues for the presence of cancer.
As is well known today, the T2 decay rate of most cancers does not follow the behavior that Damadian observed - but nonetheless “a thousand ships were launched.” Damadian himself decided that his next step would be an enormous advance. He would progress directly from his measurement of a piece of excised tissue in a test tube to a project to immediately construct an NMR machine that was truly enormous by the test tube standards of that era - big enough for a living person to stand and move around inside the machine (see figure 11). This is what is described in his 1972 patent filing (granted in 1974).
He conceived of a means to do NMR tissue measurements in a vertical column of uniform magnetic field (a few centimeters in diameter) in the center of the magnet. The radiofrequency emitter and detector would then spiral their way down, measuring again and again as they progressed from the top of the head to the foot. Only the vertical column at the very center of the person would have just the right homogeneous magnetic field strength for the magnetic resonance testing to occur. Then, the person would move a little bit so that the central magnetic field would pass through a different vertical column of the body and the spiral process would be repeated. In this way, measurements would be taken of all parts of the body that would allow the machine to detect an anomalous T2 signal that could indicate cancer and would allow the physician to know approximately where in the body to look for it.
There were many problems with this device. First and foremost, it was not actually capable of making an image. Secondly, the T2 phenomenon would only identify a small fraction of all the possible tumor types, the rest remaining undetected. The transmit and receive device he postulated would not provide a “beam” of RF energy as he proposed since the radiowaves are broadcast and then received from the antenna along a wide area. This machine did not work and was never used. It also differed from the step by step, magnificently precise and triumphant theoretical and experimental physics of earlier workers - instead it was large, crude, hypothetical, irrational in many ways and took a giant leap without working through the necessary steps along the way.
Nobel Prize Controversy
When the Nobel Prize for invention of MRI scanning was announced in 2003, Damadian was snubbed and the award went to two more traditional scientists, Paul Lauterbur and Peter Mansfield. Damadian is a creationist so he accepts magical and divine intervention in biology. That has made him an intellectual martyr for the creation science crowd. Nonetheless, his omission from the Nobel Prize is a Rorschach test, meaning different things to different observers. The Prize committee - despite an intensive effort by Damadian and a wide array of supporters - held to their position. In a reverse answer to a question Damadian asked in one of his newspaper ads, they believed that MRI would have been developed by Lauterbur with or without Damadian's contribution and that Lauterbur would have accomplished it no sooner and no later. They did not accept the possibility that the reverse of this premise might in fact be correct.
Damadian's Patent Litigation
Damadian was ultimately able to enforce one of the later patents from his company, Fonar, that covered oblique angle imaging. However, his original patent faced many difficulties. When a jury awarded Damadian a 2.2 million dollar settlement for patent infringement against a subsidiary of Johnson and Johnson, the judge threw out the verdict. When a jury awarded him more than $100 million in his patent infringement lawsuit against GE for both MRI and for oblique angle imaging, Judge Wexler threw out the entire award, leading to a complex appeal process.
Damadian won the appeal and was ultimately able to collect damages for patent infringement from GE and from all the other MRI manufacturers for the oblique angle software feature. This result has sometimes been misstated as an action by the US Supreme Court that vindicates his claim to invention of MRI scanning. In fact, the US Supreme Court did decline to review Damadian's success in the appeal that reinstated the jury verdict against GE; however, the details of the decision (by the United States Court of Appeals for the Federal Circuit 96-1075,-1106,-1091 under judges Lourie, Skelton, and Rader) warrant a close reading.
In the successful appeal, the court did deal with Damadian's original patent and found infringement because the GE scanner also distinguished the T2 decay time of cancerous tissue, not because of the issue of imaging. In addition, the appeal judge ruled that the grey scale images that the GE scanner produced were equivalent to numerical comparisons of the T2 values of selected tissues that were produced by the Damadian machine. However, there is no support in the judicial decision for the assertion that the Damadian machine produced an image. For these reasons, Damadian has a valid enforceable patent that is infringed by all MRI scanners, but his assertion that the US Supreme Court decided that he invented MR imaging is not correct.
Lauterbur and the Technical Basis of MR Imaging
So what is it that Lauterbur and Mansfield did that led to sharing the Nobel Prize for the invention of MRI? It is really Paul Lauterbur who had the transforming idea that makes magnetic resonance into a viable imaging method.
Like Damadian, Lauterbur (see figure 10) was a professor at the State University of New York (SUNY) but at its northern Long Island location at Stony Brook. In addition he was the CEO of a small company - NMR Specialties - that manufactured and operated NMR equipment. Partly as a result of Damadian's publication about the increased T2 time of tumors, Lauterbur had been forced to run NMR analyses of pieces of rats that he had to put into test tubes. Damadian was a physician but Lauterbur was a chemist who was generally sickened by the specimens that were starting to arrive. After one grisly day, he sat at a hamburger restaurant (a Big Boy to be precise) trying to get his appetite back, and searching through his mind for any possible ways he could think of that would let him measure the NMR data on intact animals. He needed a way to focus the experiment on a single location inside the animal. An answer occurred to him and almost immediately he realized that his method would allow individual locatable measurements of any point in the animal and that these could be reconstructed into images like tomograms. He jotted it all down on a napkin, then rushed out to buy a notebook where he could write out the idea in more detail to get it dated and witnessed (September 2, 1971) for a patent filing.
Lauterbur filed a preliminary patent disclosure but as the 12 month point arrived when he would need to spend money to file the actual patent, he received advice from all sides that magnetic resonance imaging had no imaginable commercial use. He allowed the deadline to pass without filing, publishing the method in Nature (after successfully appealing an initial rejection by an editor who felt this would be of limited specialist interest only).
Lauterbur's idea was to use magnetic gradients to assign a different magnetic field strength to each point in a subject volume. This idea was based on the gyromagnetic ratios that Isidor Rabi had first measured more than thirty years earlier. For protons, for example, at 4.7 Tesla, the resonant frequency of the protons (hydrogen nuclei) in water is about 200 megahertz. If you apply a magnetic field gradient across a specimen then (using approximate illustrative numbers) on the left the field strength will be 4.701 Tesla and on the right it will be 4.699 Tesla. The proton resonant frequency on the left will now be about 200.04 MHz and the frequency on the right will be about 199.96 MHz.
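The relationship between field strength and resonant frequency is a simple proportionality, and a few lines of arithmetic make the scale of the effect concrete. This is a sketch only: the function name `larmor_mhz` is ours, and the constant is the commonly quoted proton value of about 42.577 MHz per tesla.

```python
# Proton gyromagnetic ratio divided by 2*pi, in MHz per tesla (approximate).
GAMMA_BAR_MHZ_PER_T = 42.577

def larmor_mhz(field_tesla):
    """Proton resonant frequency in MHz at the given field strength."""
    return GAMMA_BAR_MHZ_PER_T * field_tesla

center = larmor_mhz(4.700)
left = larmor_mhz(4.701)    # slightly stronger field on one side
right = larmor_mhz(4.699)   # slightly weaker field on the other

print(f"center: {center:.3f} MHz")
print(f"left:   {left:.3f} MHz ({(left - center) * 1e3:+.1f} kHz)")
print(f"right:  {right:.3f} MHz ({(right - center) * 1e3:+.1f} kHz)")
```

A field difference of one thousandth of a tesla therefore shifts the proton frequency by roughly 43 kHz, which is easily resolved by a radio receiver.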
In this fashion, and by applying gradients in three different directions (X, Y and Z) you can assign a unique field strength to each location (voxel) in the sample volume so that each location in the object being imaged produces a signal at its own unique identifiable radio frequency. You can adjust your radio dial for the receiving antenna and for each frequency you select you can check on a T1 or T2 measurement experiment in just that voxel. By running the experiment hundreds of times, once for each voxel, you can determine the T1 or T2 for each voxel, know all the locations, and generate an image showing the T1 and T2 intensities as grey scale pixels in an image.
For Lauterbur's initial design he read out a line of the volume to produce an output very much like the mechanical back projection data described earlier in this paper for Gabriel Frank's non-computed axial tomogram. Once Lauterbur's device had collected data for all the lines for an image slice, he could run them through a computed back projection algorithm and voila - an MRI axial tomographic image emerged.
Ernst and Edelstein Complete the Paradigm
A few years later, in 1975, Richard Ernst filed a patent (granted in 1978) showing how a group of voxel data sets could be collected simultaneously as a complex mix of frequency spectra. Then a Fourier transform could be applied to extract the different frequency component information elements. This is really the fundamental completion of our modern magnetic resonance imaging paradigm - a complex array of magnetic field gradients to spatially encode each voxel in an image by its unique frequency and then a Fourier transform to sort it all out into a series of signal strengths (each depicted as a relative brightness on a grey scale) to generate an image based on the voxel signal data.
Fourier transforms had been used for a hundred years in the study of radiowave data and had been deployed in the evaluation of NMR spectra since the 1950's. This is a mathematical approach that dates to work by Jean Baptiste Joseph Fourier in the early 1800's that can be used to convert a “time domain” oscillating signal into a “frequency domain” description of the content of the signal. It was Ernst's insight to use this classical method from NMR analysis in order to resolve the complex information arising from an MR image data set.
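To illustrate the time domain to frequency domain conversion, here is a small sketch using NumPy's FFT. It is not Ernst's procedure; the three source frequencies and amplitudes are hypothetical stand-ins for three locations along a gradient, and the sampling rate is an assumed value chosen so that each frequency falls exactly on an FFT bin.

```python
import numpy as np

fs = 1000.0                  # assumed sampling rate in Hz
n = 1000                     # number of samples, giving 1 Hz bin spacing
t = np.arange(n) / fs

# Hypothetical sources at three positions along a gradient: each position
# oscillates at its own frequency (Hz) with its own signal strength.
components = {50.0: 1.0, 120.0: 0.5, 200.0: 2.0}
signal = sum(a * np.exp(2j * np.pi * f * t) for f, a in components.items())

# The antenna only sees the mixed signal; the Fourier transform separates it.
spectrum = np.abs(np.fft.fft(signal)) / n
freqs = np.fft.fftfreq(n, d=1 / fs)

peaks = np.flatnonzero(spectrum > 0.1)
recovered = [(float(freqs[i]), float(spectrum[i])) for i in peaks]
for freq, amplitude in recovered:
    print(f"{freq:6.1f} Hz -> relative signal strength {amplitude:.2f}")
```

Each recovered frequency identifies a location and each amplitude gives the signal strength that would be drawn as a grey level at that location.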
Another improvement came from Bill Edelstein in 1980 who showed that a pulsed gradient he called a “spin warp” could be applied that would result in an array of positional encoding by the phase that was far more efficient and usable than the frequency encoding system that Richard Ernst had described. Essentially, with the gradient applied briefly, spins on one side that had a higher magnetic field strength would speed up and the ones on the other side would slow down. When the gradient is turned off, they all resume the same speed. However, the spins that had sped up are out of phase with the ones that slowed down. If you listen/analyze for the early phase info, you will be getting information from one side of the subject, if you listen/analyze for late phase info, it will be coming from the other side.
In practice, the three types of gradients are used as follows. The X-gradient along the length of the magnet (head to toe in a cylindrical magnet) is turned on and we provide “slice selection” by doing the RF stimulation with a range of frequencies that work at just one region of the gradient at a time. To move towards the closer end of the magnet with the higher field strength we stimulate with a higher frequency, to move towards the far end we stimulate with a lower frequency. The stimulation frequency activates spins in a slab that is the image slice. By using a very narrow band of frequencies we get a thin slice, while a wider range of frequencies results in a wider slice. Areas of the subject outside of the selected slice will not be stimulated.
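The link between the band of stimulating frequencies and the slice thickness follows directly from the gradient strength. The sketch below assumes a 10 mT/m slice select gradient, a typical order of magnitude rather than a figure from this article, and the function names are ours.

```python
GAMMA_BAR_HZ_PER_T = 42.577e6   # proton gyromagnetic ratio / (2*pi), approximate

def slice_thickness_mm(bandwidth_hz, gradient_mt_per_m):
    """Thickness of the slab excited by an RF pulse of the given bandwidth."""
    hz_per_m = GAMMA_BAR_HZ_PER_T * gradient_mt_per_m * 1e-3
    return bandwidth_hz / hz_per_m * 1e3

def center_offset_hz(position_mm, gradient_mt_per_m):
    """Shift in RF center frequency needed to move the slice to a position."""
    return GAMMA_BAR_HZ_PER_T * gradient_mt_per_m * 1e-3 * position_mm * 1e-3

gradient = 10.0                                 # assumed gradient in mT/m
thin = slice_thickness_mm(1703.0, gradient)     # narrow band of frequencies
wide = slice_thickness_mm(3406.0, gradient)     # band twice as wide
shift = center_offset_hz(50.0, gradient)        # slice 50 mm from the center

print(f"narrow band: {thin:.2f} mm slice")
print(f"wide band:   {wide:.2f} mm slice")
print(f"moving the slice 50 mm needs a {shift / 1e3:.1f} kHz frequency shift")
```

Doubling the bandwidth doubles the slice thickness, and changing the center frequency moves the slice along the gradient.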
Now, to get the two-dimensional information out of the slice, we use the Y-axis and the Z-axis gradients. For the Y-axis (frequency encoding) we apply the gradient from right to left across the magnet. The entire slab has already been stimulated by the X-direction slice select gradient, so we now want to manipulate those spins to get the data from each location in the slab. The Y-gradient is applied and kept on so that frequencies will be higher on the left, lower on the right. This allows us to distinguish data coming from a tall column of the subject's left side, distinct from a series of neighboring columns. The column with the lowest frequency will be on the right.
Then, we apply the Z-gradient briefly to get each column labeled top to bottom by the phase differences mentioned earlier. Now we have a unique access to each voxel of the subject. The X-gradient selected the slice/slab by activating it, the Y-gradient applied frequency encoding information identifying the positional source of the signal within the slab from left to right. The Z-gradient applied phase encoding top to bottom. Edelstein's improvement was to use various strengths of gradient in a fixed time as opposed to Ernst's method of applying a gradient of uniform strength for various lengths of time. Edelstein's spin warp was much easier to accommodate in a pulse sequence.
Finally, we turn the antenna on and make a recording of the complex mix of signals coming from our slab. This data is run through a two dimensional Fourier transform and the output is an image slice. If we split the frequency codes into 128 separate bins and the phase codes into 128 bins, we have an image with 128 x 128 voxels. In each voxel, the image intensity will be determined by the impact of the pulse sequence applied during the image session and the results of various decay effects (T1, T2, or others) that cause some voxels to lose signal faster than others. In a T2 weighted image, for instance, voxels in the middle of a brain ventricle will have strong bright signal because of the freely tumbling water molecules of CSF. Voxels in the skull will have little signal at all because the water (proton) content is lower and there is very little movement.
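The two dimensional Fourier step can be sketched with a synthetic slice. The phantom below is hypothetical (a grey block of “tissue” with a bright square standing in for a CSF filled ventricle on a T2 weighted scan), and the recorded data are simulated by a forward transform rather than measured, so this shows only the mathematical relationship between the recorded array and the image.

```python
import numpy as np

n = 128                               # 128 frequency bins x 128 phase bins
image = np.zeros((n, n))
image[40:90, 30:100] = 0.3            # hypothetical brain tissue
image[55:75, 55:75] = 1.0             # hypothetical bright ventricle

# Simulated recording: one complex number for each combination of
# phase code and frequency code.
recorded = np.fft.fft2(image)

# Reconstruction: the inverse two dimensional transform sorts the mixed
# signal back into one intensity per voxel.
reconstructed = np.abs(np.fft.ifft2(recorded))

print("recorded array shape:", recorded.shape)
print("largest reconstruction error:", float(np.max(np.abs(reconstructed - image))))
```

The reconstruction matches the phantom to within floating point rounding, which is the sense in which the Fourier transform “sorts out” the spatially encoded signal.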
If we want to collect all the information from an entire slab there are two approaches. One is to use the slice select gradient to activate the slab and then use the Y-direction frequency gradient to leave only one column of the slab in an appropriate field strength to remain activated by the pulse. We then use the phase encode gradient to read out the signal from the various different vertically distributed locations along the column. We then repeat this 128 times, gradually working our way across the slab from left to right. If each event of RF stimulation, spatial encoding and readout of a column takes 100 milliseconds, we will have all the data for the slice collected after 12.8 seconds. If the slices are 4 mm thick with a 1 mm blank space between them, we can get through a 15 centimeter volume with 30 slices. This will take about six and a half minutes.
Peter Mansfield pointed out that it would be possible to rapidly switch the gradients so that the entire slab volume could be sampled with a single acquisition. This is called “echo planar” imaging (EPI). In this fashion the entire slice is imaged in 100 milliseconds and the whole scan is completed after three seconds. This sort of very fast imaging is critical for “stop motion” studies such as cardiac imaging. It is also very important for studies such as “diffusion tensor imaging” (discussed below) in which each image is really composed out of at least 7 and up to 256 or more image repetitions to be complete. One can readily see that 100 repetitions at six minutes each is completely outside the range of feasibility, but 100 repetitions at 3 seconds each is going to be just 5 minutes - the same general length as a non-EPI standard scan.
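The timing comparison between column by column acquisition and echo planar imaging is plain arithmetic, worked through here with the figures given in the text. The variable names are ours.

```python
columns_per_slice = 128
readout_s = 0.100                 # one stimulation, encoding and readout event
slice_pitch_mm = 4 + 1            # 4 mm slice plus 1 mm gap
slices = 150 // slice_pitch_mm    # 15 cm volume -> 30 slices
repetitions = 100                 # e.g. a diffusion study needing 100 images

conventional_scan_s = slices * columns_per_slice * readout_s   # 384 s
epi_scan_s = slices * readout_s                                # 3 s

print(f"conventional scan: {conventional_scan_s / 60:.1f} minutes")
print(f"EPI scan:          {epi_scan_s:.0f} seconds")
print(f"{repetitions} conventional repetitions: "
      f"{repetitions * conventional_scan_s / 3600:.1f} hours")
print(f"{repetitions} EPI repetitions: "
      f"{repetitions * epi_scan_s / 60:.0f} minutes")
```

One hundred conventional repetitions would take more than ten hours, against five minutes with EPI.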
Diffusion Tensor Imaging (DTI) and Diffusion Anisotropy Imaging (DAI)
The broader field of Diffusion Anisotropy Imaging includes what is widely known as Diffusion Tensor Imaging (DTI), tractography based on this (diffusion tensor tractography or DTT) as well as other advanced methods for following neural tracts such as Q-ball and HARDI (High Angular Resolution Diffusion Imaging) which do not deploy the classic tensor mathematical model. It also incorporates lower order non-tensor methods in which three gradient axes are sampled to minimize anisotropic effects where they occur in relatively isotropic tissue such as gray matter of brain and spinal cord.
Understanding how to assess diffusion in solids and liquids has a long history extending back into the 1700's. Among the most fundamental investigations of the process of diffusion are the studies by Thomas Graham (see figure 12) in the early 1800's. He initiated the quantitative analysis of diffusive processes through his work with intermingling gases and with salts in solution carried out at what is now Strathclyde University in Glasgow in the late 1820's and early 1830's. Another important element of understanding came from the botanist Robert Brown who was the first to fully describe and name the cell nucleus. In 1827, he reported observations in his microscope that very small (6 micron) granules derived from pollen grains from a wildflower -
A fascinating experiment that introduced the concept of using ellipsoids to describe diffusion was published by French mineralogist and physician Henri Hureau de Sénarmont in 1848 (see figure 12). He applied wax to the cut polished surface of a crystalline material. He then applied heat to the center of the structure with a heated piece of metal. The heat diffused through the crystal and melted the wax around a progressively expanding front moving radially outward from the heat source. In materials in which there was uniform diffusion in all directions (isotropic) - the melting edge would spread as a circle. If the crystal structure contained preferred axes of mobility - the heat would spread more quickly along some directions than others. The result was a growing ellipse on the wax coated surface.
A few years later in 1855, Adolf Eugen Fick published his insights that provided a mathematical basis for describing diffusion. Most importantly, he showed that much of the mathematics that had been developed by Joseph Fourier and others to describe thermal processes could be readily applied to diffusion.
The origin of tensor mathematics was a sudden event that occurred on the evening of October 16, 1843 as Sir William Rowan Hamilton (see figure 13) was walking with his wife near the Broom Bridge on the Royal Canal in Dublin. He was trying to imagine ways of describing complex numbers above the level of two dimensions. He abruptly realized a method to accomplish a description in four dimensions. Fearing he would forget and having no pen and paper, he drew a pen-knife from his pocket and carved the fundamental equation into the stone of the bridge. Hamilton's math is called “quaternions” and he explicitly imagined it as dealing with the three dimensions of space and the fourth dimension of time. The proponents of quaternions, with their notation and concepts, then came into conflict with proponents of vector math and its notations among mathematicians and scientists. Gradually, vector math came to dominate in many areas, but the mathematical descendants of the quaternion have remained important as well.
More than fifty years later, in the closing years of the 19th century, Woldemar Voigt expanded Hamilton's usage of the word “tensor” into its modern sense by applying it in his studies of the physics of crystals. Gregorio Ricci-Curbastro (see figure 13), in the process of developing absolute differential calculus with his student Tullio Levi-Civita, used the term “tensor” to describe an updated version of Hamilton's quaternions, and developed a fully worked tensor calculus. Their work was read by Albert Einstein who then began to consider diffusion and tensor mathematics. Albert Einstein significantly advanced the mathematical development of tensors in his work on general relativity, using tensors to describe transformations in space and time.
In the early 1950's, Erwin Hahn as well as Herman Carr & Edward Purcell pointed out that it was possible to consider an additional form of NMR signal decay - other than T1 and T2 - based on diffusion. The idea was to turn on a magnetic field gradient during the measurement process. Recall that at this time, long before MR imaging was invented, there were no field gradients used for position. It was Lauterbur who borrowed from the idea of diffusion gradients to conceive of the positional gradients now used for MR imaging. Herman Carr points out that in his 1952 Harvard PhD thesis, he was the first to report the idea of using the diffusion gradient to encode spatial information - at least along a single axis.
Initially, diffusion was thought of as an artifact that could cause signal decay that was not truly due to T1 or T2 effects as well as a phenomenon of interest in its own right. If there was relatively little diffusion of the molecules that held the protons being measured, then the protons would remain in the area of strong uniform magnetic field strength. However, if the molecules tended to diffuse isotropically in all directions, then they would move to positions of different magnetic field strength and would rapidly dephase and lose signal. Hahn used his idea of the spin echo generated from a refocusing second RF pulse in order to remove the effects of diffusion from the T2 signal. Carr and Purcell more explicitly pointed out not only how to perfect the refocusing pulse, but also how to make quantitative measurements of diffusion. Hence, NMR could measure rates of diffusion under various conditions and with various elements and molecules.
Donald Woessner - then a chemist working at Mobil - was the first to extensively consider and investigate the use of NMR in a setting of restricted diffusion. He pointed out in an article published in June of 1963 that if there were barriers to the free diffusion of molecules, then the apparent diffusion coefficient would be decreased due to the physical barriers. The parameters of the measurement could be adjusted based on the typical space between barriers to bring out the effect. This method showed some promise for measuring the free diffusion space inside some porous structures. Woessner appreciated that the existing gradient diffusion method was clumsy to use at small scale and introduced the idea of using two pulsed diffusion gradients - set at different time intervals - to determine the size of the compartment in which the diffusion was taking place.[50, 51]
Edward Stejskal (see figure 14) was a 30 year old assistant professor in the Department of Chemistry at the University of Wisconsin in Madison when John Tanner joined his group as a graduate student in 1962. Tanner, who was actually two years older than his professor, had been working at a small technology firm in Madison after finishing his master's degree in 1954. Although Stejskal's focus was NMR, it was Tanner who introduced the diffusion issue to the lab. He had been working on fluid viscosity in gels at the technology firm and had the idea of doing a PhD focused on trying to use NMR diffusion methods to clarify the behavior of fluids in this situation. Stejskal was aware of the use of diffusion in NMR and decided to green light Tanner's project. However, after 18 months, Tanner was making very little progress - and not for lack of trying.
The diffusion gradient methods Tanner started with dated back to early observations by Hahn and by Carr & Purcell. The problem Tanner was having was that the water in gels diffuses slowly and it was requiring progressively larger gradients to try to detect an effect. The gradients required were at the limit of what was possible and there were effects of the gradients that were swamping out the diffusion information.
Stejskal tried to imagine theoretical approaches to solve the problem. Then, for reasons he cannot explain, shortly before midnight on May 1st of 1963, he suddenly conceived the solution - two pulsed gradients rather than continuous application of a single gradient. He jotted the idea down on the margin of an equipment logbook, left a note for Tanner and left the lab around 1am. The next day, Tanner abandoned the approaches he had been trying and set to work immediately to try to get the apparatus to generate the pulses to run the experiment. This succeeded and led to their very widely cited 1965 publication. The Stejskal-Tanner method is still the workhorse of all diffusion imaging 45 years later.
At the time there wasn't much interest in this. Both Stejskal and Tanner moved on to other areas. Stejskal also points out that standard NMR equipment didn't handle the pulses well. Years later, as the equipment capabilities in NMR caught up, interest resumed. The introduction of their method into MR imaging by Michael Moseley in 1984[53, 54] laid the ground work for the explosion of interest in diffusion imaging caused by Moseley's subsequent finding of diffusion MRI's utility in early detection of ischemic stroke in 1990.[55-58]
The idea of using two gradient pulses is a transformation of the ideas that Hahn and Carr & Purcell had applied to RF pulses. The Stejskal & Tanner idea was to pulse the diffusion magnetic field gradient on for only a brief period and then to do this a second time after a carefully selected interval. The two pulses are placed symmetrically before and after the 180 degree refocusing spin echo pulse. This has the effect of amplifying the diffusion sensitivity since it removes the T2* effects of the gradient pulses, leaving just the impact of physical repositioning of the protons due to diffusion. Effectively, the first pulsed gradient causes dephasing, then, after the 180 degree pulse, the second pulsed gradient reverses and eliminates the dephasing - but only does so for those protons still at the same position in the gradient.
The time interval between the two pulses also sets the rate of diffusion that is being sampled - if the two are fired very close together, only fast diffusing molecules will be affected. When the time between them is relatively large, then even slowly diffusing molecules will be affected.
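The pulsed gradient experiment is conventionally summarized by a single diffusion weighting factor, usually written b, with the echo signal falling off as exp(-bD) for a diffusion coefficient D. The sketch below uses the standard expression for a pair of rectangular gradient pulses of strength G and duration δ whose leading edges are separated by Δ. The gradient strength, the timings and the two diffusion coefficients are assumed illustrative values, not figures from this article.

```python
import math

GAMMA = 2.675e8   # proton gyromagnetic ratio in rad/s/T (approximate)

def b_value(gradient_t_per_m, delta_s, big_delta_s):
    """Diffusion weighting in s/m^2 for two rectangular gradient pulses.

    delta_s: duration of each gradient pulse.
    big_delta_s: time between the leading edges of the two pulses.
    """
    return (GAMMA * gradient_t_per_m * delta_s) ** 2 * (big_delta_s - delta_s / 3)

def attenuation(b, diffusion_m2_per_s):
    """Fraction of the echo signal that survives diffusion dephasing."""
    return math.exp(-b * diffusion_m2_per_s)

# Assumed values: 40 mT/m gradient, 20 ms pulses, 40 ms separation.
b = b_value(0.040, 0.020, 0.040)
b_longer_gap = b_value(0.040, 0.020, 0.080)

fast = attenuation(b, 2.0e-9)    # roughly free water near room temperature
slow = attenuation(b, 0.2e-9)    # a hypothetical slowly diffusing medium

print(f"b = {b / 1e6:.0f} s/mm^2, with a longer gap {b_longer_gap / 1e6:.0f} s/mm^2")
print(f"surviving signal: fast diffusion {fast:.3f}, slow diffusion {slow:.3f}")
```

Lengthening the interval between the pulses raises b, so that more slowly diffusing molecules also produce a measurable loss of signal.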
Stejskal apparently was not aware of Woessner's work at the time of his initial idea in May of 1963. However, the Stejskal solution was more effective because it placed the two pulsed gradients on either side of a refocusing pulse - just as Hahn had done with the spin echo.
Even more importantly for future applications, Stejskal fully considered the implications of this advance for exploring diffusion in all its aspects. Prophetically, Stejskal appreciated the basic features of applying pulsed gradients to study diffusion in anisotropic media using the tensor ellipsoid model. Drawing upon the classical work of Carslaw & Jaeger in heat diffusion, he pointed out in a second paper published in 1965 that NMR measurements of anisotropic diffusion should be oriented along the principal axis of a tensor ellipsoid. This is almost the exact idea that Peter Basser & Denis Le Bihan believed they had discovered at the time of their patent filing nearly 30 years later.[61, 62] In the file history of their patent examination, they incorrectly - but successfully - asserted that no one had ever measured this diffusion tensor for the translational (bulk random-walk movement) self-diffusion of water and that was the basis upon which their patent was granted.
In the 1970's other researchers such as Blinc pointed out that by rotating an anisotropic specimen relative to the direction of measurement, a number of different values for the translational diffusion coefficient could be obtained and that these could be used to accurately fill in the diagonal and off-diagonal elements of the diffusion tensor. This step had the practical use of making it possible to determine the orientation of the true parallel and true perpendicular orientations for accurately measuring the relative amount and direction of anisotropy within a sample.
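The step from a set of directional measurements to the full tensor can be sketched as a small least squares problem. This is an illustration of the general idea and not a reconstruction of Blinc's procedure: the model assumed is that the diffusion coefficient measured along a unit direction g equals g^T D g, the six measurement directions are an arbitrary choice, and the “true” tensor is a hypothetical fiber-like sample.

```python
import numpy as np

def fit_tensor(directions, coefficients):
    """Least squares fit of a symmetric 3x3 diffusion tensor D.

    directions: (N, 3) unit vectors along which diffusion was measured.
    coefficients: (N,) diffusion coefficient measured along each direction.
    """
    g = np.asarray(directions, dtype=float)
    x, y, z = g[:, 0], g[:, 1], g[:, 2]
    # g^T D g expanded in the six unique tensor elements.
    design = np.column_stack([x * x, y * y, z * z, 2 * x * y, 2 * x * z, 2 * y * z])
    solution = np.linalg.lstsq(design, np.asarray(coefficients, dtype=float), rcond=None)[0]
    dxx, dyy, dzz, dxy, dxz, dyz = solution
    return np.array([[dxx, dxy, dxz], [dxy, dyy, dyz], [dxz, dyz, dzz]])

# Hypothetical sample: fast diffusion along one oblique axis (units 1e-3 mm^2/s).
axis = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
true_tensor = 0.3 * np.eye(3) + 1.4 * np.outer(axis, axis)

dirs = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                 [1, 1, 0], [1, 0, 1], [0, 1, 1]], dtype=float)
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
measured = np.einsum("ni,ij,nj->n", dirs, true_tensor, dirs)

fitted = fit_tensor(dirs, measured)
eigenvalues, eigenvectors = np.linalg.eigh(fitted)   # ascending eigenvalues
principal_axis = eigenvectors[:, -1]                 # direction of fastest diffusion

print("eigenvalues:", np.round(eigenvalues, 3))
print("principal axis:", np.round(principal_axis, 3))
```

The eigenvectors of the fitted tensor give the true parallel and perpendicular orientations of the sample, whatever its orientation relative to the measurement axes.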
By this point, it was clear that various NMR scientists had considered what would happen if the structure they were measuring had a strong axis of anisotropy. If they placed the structure so that its axis of anisotropy was perpendicular to the direction of the gradient, relatively little decay took place because the diffusing components tended to stay in an area of similar field strength as they diffused. Similarly there was an increased rate of decay if the axis of anisotropy was parallel to the direction of the gradient. With this theoretical basis in hand, several groups began exploring the detailed behavior of water diffusion in muscle cells in order to show that cell shape could be measured by its effect on water diffusion.[64, 65]
Diffusion Weighted Imaging
In 1984, Michael Moseley (see figure 14) initiated the field of diffusion imaging by inserting the Stejskal-Tanner pulsed gradient into an imaging sequence to assess the diffusion coefficient in structures seen in an MR image.[53, 54] Two years later, Le Bihan reported diffusion coefficients from various normal and pathologic tissues following Moseley's method. The most important clinical discovery in diffusion weighted imaging was Moseley's finding published in 1990 that diffusion weighted imaging could detect the effect of acute stroke. Prior to this time, both CT and MRI were relatively ineffective for determining if a patient had an ischemic stroke. The impact of Moseley's finding was analogous to Damadian's discovery 20 years earlier that tumors could have different T2 relaxation properties when compared to their parent normal tissues. Moseley's finding caused an explosion of interest in diffusion MRI so that in short order, diffusion weighted imaging was being applied in tens of thousands of clinical images throughout the world.
Michael Eugene Moseley started his academic career in the Department of Physical Chemistry at the University of Uppsala. He worked with Peter Stilbs - then just two years out from completing his own PhD. Moseley published his first papers on NMR spectroscopy with Nitrogen in 1978. His University of Uppsala PhD Thesis, submitted in 1980, covered solvent and polymer dynamics in polystyrene solutions, so he would have encountered NMR diffusion problems similar to the one that John Tanner was struggling with when Tanner joined Stejskal's lab fifteen years earlier. After leaving Sweden, Moseley did a post-doc at the Weizmann Institute in Rehovot, Israel (where Saul Meiboom had done the work leading to the CPMG pulse paradigm many years earlier). Moseley's project as a post-doc involved the use of the Stejskal-Tanner pulse sequence to study the anisotropic diffusion of methane and chloroform in smectic liquid crystals.
From there, Moseley moved to California, joining the Department of Radiology at UCSF in 1982. At this point he shifted focus from inorganic chemistry and ultimately combined his classical training in NMR with his recent experience in anisotropic diffusion in crystals and applied both to the new field of MR imaging.[53-55, 70, 71] He went on to revolutionize the field with his insights and discoveries in the application and use of NMR diffusion methods to solve important clinical problems in medical imaging. He has recently served as the President of the International Society for Magnetic Resonance in Medicine - the leading academic research society focused on MR, a group that has tens of thousands of members from the MR research and clinical community.
Origins of diffusion tensor imaging
The initial diffusion weighted imaging studies quickly revealed a troubling aspect of the use of diffusion for image contrast - Moseley reported at a 1989 meeting of the Society for Magnetic Resonance in Medicine (SMRM) that the image intensity of white matter areas varied in their diffusion contrast appearance depending upon the relative angle between the diffusion gradient and the long axis of the fiber tract. Further details of his findings were presented at a workshop in Bethesda, Maryland in June of 1990. This complicated the utility of diffusion MRI for identifying stroke in white matter regions, but Moseley also appreciated that there was an unanticipated potential new opportunity for MRI in this as well. Both Moseley's group at UCSF and a group at the Hammersmith Hospital in London published papers later that year showing that by taking images with one gradient parallel and one gradient perpendicular to known tracts, a significant difference in intensity could be observed. Radiologists thrive on the discovery of new forms of tissue contrast, and this finding of contrast from diffusion anisotropy generated tremendous interest and anticipation.
One solution to the imaging problem of producing a single valid image that correctly depicted the anisotropy in each voxel came from studies of plant tissues. Paul Callaghan and his associates carried out NMR imaging of a thin cross section of a wheat grain. They rotated the sample, collecting images at every 2 degrees of rotation, and then carried out a filtered back projection algorithm - as Gabriel Frank and Godfrey Hounsfield had done - to generate a cross sectional image.
However, in a July 1992 patent filing, Filler and his associates revealed a series of critical aspects of diffusion anisotropy imaging that preceded other groups by several years and which became the basis for modern diffusion tensor imaging. The key elements were initially made public in an abstract at the August 1992 SMRM (Society for Magnetic Resonance in Medicine) meeting in Berlin. However, the fields of diffusion tensor imaging and tractography were truly launched when Michael Moseley again presented the findings, methods and images from the Filler group to a packed plenary session of more than 700 MR scientists at the SMRI meeting in March of 1993. As Moseley wrote to Filler later, “...your slides were of course an instant hit...” It was an electrifying scientific moment - numerous projects in the development of tractography were launched that day (personal communication: Michael E. Moseley, April 8, 1993, http://www.neurography.com/neurography-1993moseleyletter.pdf, by permission of Michael Moseley).
The most important idea is that instead of each voxel having an image intensity for a 2D image, each voxel should instead contain an arrow with a specific length and direction in the three dimensional space of the voxel. From the length of the arrow we can learn about the anisotropic diffusion coefficient for the voxel. From the direction in space, we can learn about the dominant direction of neural fiber tracts within the volume.
The patient being imaged does not need to be rotated; rather, the diffusion gradients can be applied from many different directions by mixing inputs from the standard three gradients. Several different images are acquired, but these are then combined via vector or tensor math to result in a single image that is “rotationally invariant” - the image intensity, based on the anisotropic diffusion coefficient, is a single true value instead of being different in each of a series of images obtained from various angles.
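A small numerical sketch may help make “rotationally invariant” concrete. It compares a single-axis measurement, which changes as a fiber-like tensor is rotated, with the mean of three orthogonal measurements, which does not. The tensor values, rotation angles, and helper names are assumptions made for illustration, and the mean of three orthogonal axes is used here as one simple invariant combination rather than as the specific calculation in the patent.

```python
import numpy as np

def adc_along(D, g):
    """Apparent diffusion coefficient along direction g: g^T D g."""
    g = np.asarray(g, dtype=float)
    g = g / np.linalg.norm(g)
    return float(g @ D @ g)

def rotation_z(theta):
    """Rotation matrix about the z axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

D = np.diag([1.7, 0.3, 0.3])  # assumed fiber-like tensor, long axis along x

# An oblique gradient is obtained by driving the x and y coils together.
oblique = adc_along(D, [1.0, 1.0, 0.0])

single_axis, combined = [], []
for theta in (0.0, np.pi / 6, np.pi / 3):
    R = rotation_z(theta)
    D_rot = R @ D @ R.T  # the same tissue, oriented differently in the scanner
    single_axis.append(adc_along(D_rot, [1, 0, 0]))
    combined.append(sum(adc_along(D_rot, axis)
                        for axis in ([1, 0, 0], [0, 1, 0], [0, 0, 1])) / 3.0)

# single_axis varies with orientation; combined stays constant.
print(single_axis, combined)
```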
The use of a tensor formalism in NMR of diffusion had been well known for decades, but this concept of generating a single calculated image made up of complexly data-laden voxels capable of generating neural tract traces was entirely new. Instead of being flattened into a pixel of data with a gray scale of 1 to 256, each voxel would be a “container” that could hold complex spatial information that could be used in various computational methods to demonstrate various aspects of the physiology and pathology of a tissue. Voxels could be associated with each other across three dimensional space based on similarity of axonal orientation.
The Filler patent presents both a simple geometric method using arctangents with input from three gradient directions and also points out that with diffusion gradients activated in more than three directions, a diffusion tensor may be calculated. It then goes on to show various ways to generate tractographic images (see figure 15).
It should be noted that the patent went well beyond simply demonstrating voxel orientations in an image slice. The idea was selective, progressive tract tracing. The radiologist could select a seed and destination, then learn what tracts progressed from the start point to the end point - an MRI version of classical tract tracing techniques.
Among the most important clinical findings reported in the 1992 patent filing was the discovery that in an encephalitis model, there were some pathologies that were best detected by alterations in the tractographic data. This meant that DTI could detect white matter pathology that could not be seen with any other MRI method. This finding was analogous to Damadian's finding on tumor T2's or Moseley's discovery that diffusion coefficients changed in stroke. It is the basis for the current vast literature in which DTI is used for early detection of Alzheimer's,[77, 78] Parkinson's,[79, 80] diffuse axonal injury in head trauma and in numerous other clinical applications. It would be nearly five years before any other group reported the use of vector or tensor methods to assess pathologies in the living brain that could only be detected with this technique.
In the vector/arctangent model, if diffusion data were collected in three different orthogonal directions, then a vector could be calculated. The length of the vector calculated from data on the three main axes would show a close estimate of the real diffusion coefficient for an anisotropic voxel, independent of the orientation of the gradients relative to the direction of anisotropy. This type of measurement is similar to what is now called the diffusion trace or “Fractional anisotropy” (FA).
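The vector calculation can be sketched in a few lines. This is a minimal illustration of the general idea (a length from the three orthogonal measurements plus angles from arctangents), not a reproduction of the patent's own formulas; the function name, the choice of azimuth and elevation as the two angles, and the input values are assumptions.

```python
import math

def vector_from_axes(dx, dy, dz):
    """Combine diffusion measurements along x, y and z into a vector
    length and two orientation angles (in degrees)."""
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    # Angle within the x-y plane, measured from the +x axis.
    azimuth = math.degrees(math.atan2(dy, dx))
    # Angle of the vector above the x-y plane.
    elevation = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return length, azimuth, elevation

# Hypothetical voxel with equal diffusion measured along x and y, none along z.
length, azimuth, elevation = vector_from_axes(1.0, 1.0, 0.0)
print(length, azimuth, elevation)
```

Because the measured values are non-negative, angles computed this way from the three axes alone always fall within one octant.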
This approach of using vector length rather than a single axis diffusion acquisition is one of several similar methods for calculating a composite result for the diffusion coefficient of a voxel so that the result is independent of the angle of the gradients.[62, 76] This strategy now also dominates standard diffusion weighted MRI for stroke.[83, 84] This is because grey matter is not truly “isotropic” and strokes involve both grey and white matter. By collecting gradient information in three axes and using vector or tensor math to calculate the true - directionally independent - measure of diffusion, the artifacts that arise from single direction information can be eliminated. In some sequences with short echo times (reduced T2 weighting), all three gradients can be activated simultaneously so that no calculation is required. This is also the approach now being used to apply diffusion imaging to functional MRI (see below).[86, 87]
The diffusion tensor concept had been very well worked out in other fields several decades earlier. One of the most important applications of diffusion tensor theory in magnetic resonance before 1992 was in the analysis of spinel crystals such as those being developed as ferrite-type magnetic resonance contrast agents.[88-90] The diffusion tensor theoretically requires data from at least six different directions although in practice, the three major or diagonal elements of the 3x3 matrix that describes the tensor will provide most of the needed information. It is clear that at the time LeBihan wrote his 1991 review of diffusion imaging as well as other papers that year,[92, 93] the major thrust was to obtain just the x, y and z directions as independent data elements. In the 1991 review paper, LeBihan cites the 1960 edition of Jost's textbook on diffusion in which the mathematics of the diffusion tensor and the ellipsoid model are discussed in context of the 110 years of work in these fields (1848 to 1960). Nonetheless, aside from the information in the Filler et al 1992 patent filing, no MR researcher actually reported having calculated such a true anisotropic diffusion coefficient in a brain imaging situation until several years later.
In August of 1993, Peter Basser and Denis LeBihan (see figure 16) filed a patent application based on applying the ellipsoid model of diffusion - with a filing date one year after their presentations at the SMRM meeting in Berlin in 1992.[96, 97] Denis LeBihan had been involved in pioneering work in diffusion MRI for a number of years at that point and had filed a patent concerning the study of intra-voxel incoherent motion of water. Peter Basser was filing patents about strain gauges in which the mathematics of the strain tensor was employed. Basser learned of Le Bihan's work when he wandered into a poster presentation in a tent in a parking lot at NIH in the fall of 1991. He immediately became very excited about the potential for deploying tensor math to solve the problem of the need to have a rotationally invariant method of processing the data. Although the two signed a “disclosure document” at that time with an eye towards a future patent filing, Basser reports that he became dejected when Denis Le Bihan pointed out that although this was a nice idea, neither of them knew how to actually measure the tensor.
However, with the impending deadline of the abstract submission date on March 6, 1992, for the August SMRM meeting in Berlin, and working with James Mattiello, they worked out the concepts. Somewhat ignominiously, they obtained a pork loin which they rotated around in an MRI scanner as they collected diffusion image data. The numbers apparently were fed into software such as MathTensor that had recently become available to run with Stephen Wolfram's Mathematica 2.0 software. The results showed that as they rotated the pork loin, the ellipse constructed in the software rotated in the reference frame. This is what they published in the 1992 abstract.
In 1994, Basser and LeBihan published an initial summary article on their ellipsoid tensor model and it is this paper that is most widely cited as the original paper on diffusion tensor imaging. This is indeed a fundamental paper that provides a rigorous mathematical basis for tensor based diffusion anisotropy imaging. In 1995, Basser pointed out the potential to calculate the fractional anisotropy number and the following year the first actual data of this sort was published by Pierpaoli et al, four years after the Filler et al patent filing and the meeting abstract by Todd Richards - one of the co-inventors on the Filler et al patent.
Conflict between DTI inventor groups
For various unclear historical reasons, the publications by the group of inventors in the Filler patent as well as the reports at the principal magnetic resonance research meeting went unheeded by virtually all other researchers in the field for several years. In part, this appears to have occurred because Peter Basser and Denis LeBihan at NIH held the attention of the MR community through their vigorous program of publication and reporting on the development of the technique. Basser and LeBihan published steadily in this field reporting increasingly complex math without showing experimental results.[61, 100] Eventually, as Basser told an interviewer, he became concerned at how few MR scientists were entering this field and decided he must “dumb down” diffusion MR if he expected any other group to follow.
Many academics are unfamiliar with the process of patent submission and evaluation so some explanation helps clarify what happened with these two patents - US 5,560,360 from the Filler group and US 5,539,310 from the Basser group. The laws have changed over time and they differed significantly at that time for inventors working in Europe versus those working in the United States. In Europe, once a discovery or invention has been publicly disclosed - even verbally at a meeting presentation - it can no longer be patented. However, in the U.S. an inventor was allowed one full year from the date of disclosure before having to file a patent application. In the U.S., if there is a dispute over the priority of two patents - who invented first - then one can look to signed and witnessed notes to find a date of conception - however the US Patent Office will not recognize any such documents if they are not prepared in the geographical United States.
Once the initial applications are filed, the inventors are allowed one year to update or add to or change the contents before the final application with all legal “claims” attached must be submitted. This document is then usually published by the World Intellectual Property Organization (WIPO) as a “Patent Cooperation Treaty” or PCT document within 18 months of the original earliest filing date. This PCT version gets an initial search of the literature for competing published prior art that might invalidate it. The inventors are required to turn in any prior art they are aware of. The inventors then send the PCT document out to different jurisdictions (e.g. United States, Japan, Europe, Australia, Canada) with appropriate translations where each goes about its own process of patent examination for non-obviousness, validity and novelty. Various objections and rejections are raised by the examiners, the applicants reply, and if there is agreement, an amended version of the patent is accepted and published by each of the jurisdictions as it finishes its process. Patent examination can take 1 to 12 years - or longer!
In the case of these two patents, Filler et al started to file in March of 1992 and had a series of “priority documents” up to July 31 of 1992 containing the inventive material - including a discussion of tensors and numerous orientations of the gradients - and then filed the final application in March of 1993 upon which it was published as a PCT in September of 1993. Like the Filler group, the Basser group presented papers at the August, 1992 Berlin meeting of the Society for Magnetic Resonance in Medicine, emphasizing the mathematics but not including any actual images.[96, 97] The Basser group then filed their initial application 12 months later in August of 1993, filed their final draft in August of 1994 and had their PCT publication in February of 1994. Both patents were granted and published in the United States in 1996, apparently without the relevant examiners being aware of each other's work.
Michael Moseley requested the images from Filler and Richards and re-presented them in the plenary session at the 11th Annual Meeting of the SMRI (Society for Magnetic Resonance Imaging) in San Francisco on March 28, 1993. That session was moderated by Denis Le Bihan. This was five months before Le Bihan filed his patent for diffusion tensor imaging.
The patent by Filler et al[76, 101] was granted in the US and some of the initial reports were published in the Lancet and reported in the New York Times, CNN and ABC news. Nonetheless, Basser and LeBihan apparently remained unaware of this work, or at least unwilling to acknowledge it by reference. Even after the Filler patent was cited 32 times in an exchange between the US Patent Office and Peter Basser in 1999 (see below), Basser and LeBihan both continued to omit any reference to that work in numerous publications to the present day - despite submitting more than 150 clinical and historical publications and book chapters on the subject since that time.
The Basser, Mattiello, and LeBihan patent is very narrowly focused on using an NMR or MRI system to fill the six matrix components of a diffusion tensor ellipsoid model. The biggest problem it faced in the patent examination was a series of comparisons to a patent issued in 1984 to Wilfried Bergmann. In that patent, Bergmann proposed an MRI scanner in which the transmit and receive coils were superconducting. He argued that this would increase the precision of the system for measuring T1, T2, and the diffusion tensor. He also provided superconducting coils for generating “three dimensional pulsed field gradients” to measure the diffusion tensor.
The patent examiner initially rejected all of the claims by Basser saying that Bergmann had already invented a method of using MRI to measure the diffusion tensor. Basser replies by arguing that Bergmann must be talking about the tensor of magnetic spins. The examiner again rejects all the claims saying, no - it is unmistakable that Bergmann is talking about using pulsed gradients to measure the diffusion tensor. Basser replies that this must then be the rotational diffusion tensor (a means of using NMR to study the rotations of molecules) rather than water displacement. To support this, Basser points out that when Bergmann gives a reference to a textbook by Farrar and Becker to support the methodology for measuring the tensor, that the text only covers rotational diffusion. The examiner, Raymond Mah, again rejects all the claims because the textbook actually does describe how to measure both the rotational and the translational diffusion with NMR. The Supervisory Examiner then confirms final rejection of the patent in July of 1995.
However, Basser et al finally get a Christmas present - on December 26, 1995, their attorney David Rossi makes a phone call to Raymond Mah and convinces him that no one has ever measured the diffusion tensor of water with an NMR system. Mah sends out a note on December 27th allowing all 35 claims of the patent. Rossi appears to have been completely wrong on this, but the patent was then granted without further discussion. The Bergmann patent really does not provide methodology for measuring the diffusion tensor and the primary references that Bergmann cited do not describe it. The Tanner reference in the Farrar book does describe measuring the translational diffusion tensor, but the examiner did not check these further references.
So when did Peter Basser become aware of the Richards report and the Filler patent if he missed the 1992 Berlin abstract, the patent publication and the 1993 plenary session about diffusion MRI in San Francisco and never heard about this from Denis Le Bihan? This definitely took place in 1999.
In a conflict with the US Patent Office in examination of a later US Patent 5,969,524 from Pierpaoli and Basser, the examiner cited the Filler patent numerous times in rejecting claims filed by the NIH scientists regarding similar subject matter Basser was submitting in this 1997 application. The supervisory US patent examiner Leo Boudreau wrote: “Regarding the above claims, Filler et al teaches a method for assessing diffusion anisotropy in an object; obtaining information signals representing a diffusion tensor for each of a plurality of localized regions in said object (note col. 20 lines 35-67); Information is being obtained to represent a diffusion vector”. Pierpaoli and Basser responded only by incorrectly trying to assert that the Filler patent did not include more than two axes of diffusion - directly in conflict with both the Filler patent and the Richards 1992 publication. The Filler patent actually states:
“gradient coils oriented in three planes can be simultaneously activated in various combinations to achieve the effect of an infinite variety of differently oriented gradients .... a technique has been developed for observing diffusional anisotropy, independent of its degree of alignment with any individual gradient axes. This process involves the combination of information from anisotropy measurements obtained along three standard orthogonal axes or using information from multiple fixed axes.”
In the July 31, 1992 priority document by Filler et al (p.21), the use of tensor treatment is explicit, as is the relationship to known tensor analysis methods for magnetic data. They state:
“The use of vector analysis algorithms of this sort, or involving the treatment or coordinate transformation of MR diffusional anisotropy data with tensors of various rank can improve the generality and flexibility of neurographic imaging. The example described above demonstrates that by the application of tensor and/or vector analysis methods such as algorithms similar to those developed for the evaluation of e.g, magnetic, thermal, or structural anisotropy data, it is possible to greatly improve the flexibility and generality of image techniques for neurological diagnosis.”
Further, Rossi argues on behalf of Pierpaoli and Basser that even if the Filler patent does mention using the tensor for tractography, none of the four inventors on the Filler patent would have known how to use it for that purpose. This assertion did not impress the examiner.
Pierpaoli and Basser were forced to amend the new patent and narrowly limit the claims that were subsequently granted to cover only a theoretical lattice concept that has not proven to have any utility.
Aside from the dispute, the fact that Filler's patent is one of only three documents cited and that it is referenced 32 times in the correspondence makes it quite impossible that Peter Basser was unaware of the Filler patent or its contents as he continued to publish numerous topical and historical articles about the field without referencing that patent or related publications or any of the authors over the following 10 years. His patent was then licensed to GE, Philips and Siemens apparently without these companies being alerted to the competing patent.
Diffusion anisotropy and tractography
The special problems in this task arise because of two ways in which the MRI diffusion tractography problem differs from other diffusion measurement systems. Dating back to the non-computed axial tomogram, continuing on through CT scanning and all MRI work to that point - researchers were concerned with how best to determine contrast between one pixel and an adjacent pixel in a two dimensional or tomographic representation. Tractography calls for shifting fully into a three dimensional realm where the structure being determined extends beyond the plane of imaging.
In diffusion MRI, we can tell that diffusion anisotropy in a neural tract is causing water molecules to move preferentially perpendicular to a gradient, but we can't tell which direction along the tract the water molecules are traveling - towards us or away from us. The image intensity is identical for the measurement of diffusion along any axis whether the water is moving in either direction along the tract because it does move in both directions in the neural tract. In general diffusion work this is never a problem. In fact if we are calculating fractional anisotropy (FA) values that essentially give the length of the resultant vector, the answer always comes out the same whether or not we know the true sign (positive or negative) of the direction of the neural tract relative to each axis.
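This sign-blindness can be checked numerically. In the sketch below, which uses the standard quadratic form g^T D g and the standard eigenvalue definition of fractional anisotropy, a measurement along a direction and along its opposite are identical, and two mirror-image tensors share the same x, y and z measurements and the same FA. The tensor values and function names are assumptions for illustration only.

```python
import numpy as np

def adc_along(D, g):
    """Apparent diffusion coefficient along direction g: g^T D g."""
    g = np.asarray(g, dtype=float)
    g = g / np.linalg.norm(g)
    return float(g @ D @ g)

def fractional_anisotropy(D):
    """Standard FA computed from the eigenvalues of the tensor."""
    evals = np.linalg.eigvalsh(D)
    mean = evals.mean()
    den = np.sqrt((evals ** 2).sum())
    if den == 0:
        return 0.0
    return float(np.sqrt(1.5) * np.sqrt(((evals - mean) ** 2).sum()) / den)

# Hypothetical tensor for a tract running obliquely between +x and +y.
D = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 0.3]])

# Measuring "towards us" or "away from us" gives the same number.
forward = adc_along(D, [1, 1, 0])
backward = adc_along(D, [-1, -1, 0])

# Mirror-image tract (between +x and -y): only the off-diagonal sign changes.
D_mirror = D.copy()
D_mirror[0, 1] = D_mirror[1, 0] = -0.5

fa = fractional_anisotropy(D)
fa_mirror = fractional_anisotropy(D_mirror)
```

The diagonal elements, which are what the x, y and z gradients measure directly, are the same for both tensors, so the three axis measurements alone cannot tell the two tract orientations apart.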
However, for tractography, we have to know the true direction of the tensor relative to the shared Cartesian frame of reference. Filler has outlined elsewhere an anti-symmetric dyadic tensor model that best explains how the additional gradient axis information solves this problem. Basser and LeBihan in their 1993 patent filing (granted in 1996) failed to suggest any method for achieving tractography. Basser has stated in an interview that as of 1994, tractography seemed like science fiction to him. Basser and LeBihan were not able to discover a method to do tractography.
In the 1993 patent application, the Basser group did not propose any means to describe or utilize the angular orientation of the tensor in Cartesian space. Like a number of authors before and after their filing[92, 108, 109] they proposed the use of color maps so that each independent axis of data collection could be assigned a color and the colors then mixed to provide a general view of the directional quality of the data. Even this approach is fairly unproductive if the data is not multiplied by FA information.
Basser has stated that he sought to accomplish tractography by developing a mathematical tensor field model based on the physics of streamlining that would extend his ellipsoid diffusion tensor model to the tractographic level. However he never succeeded in this task. It seems as though this approach could not work since neural tract directions are determined by evolutionary history and neural function and not by any laws of physics.
It is helpful to keep in mind that in the voxel you can imagine a three dimensional set of axes (x, y, and z) but that the center of this Cartesian system is at the center of the voxel rather than any arbitrary corner of the voxel. Now imagine what happens when you have a diffusion measurement of 1 along the X-axis. You will see 0.5 on the -x side and 0.5 on the +x side of the center of the grid. Now suppose you have a measurement of 1 along the Y-axis also - again there will be 0.5 on the negative side and 0.5 on the positive side. We can keep this simpler by coming back with a very low value on our Z- measurement - nearly 0. Even now though, you can imagine four different vectors pointing out from the origin. One midway between the +x and the +y arms, one midway between the +x and the -y arms, and so on - four different vectors organized into two anti-symmetric pairs. How do you decide which is correct? You need to collect data from an additional plane between the axes to learn which is a ghost dyad and which one represents the real Cartesian direction.
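The thought experiment can be run as a short simulation. The sketch assumes an idealized stick-like fiber whose apparent diffusion along a gradient direction is proportional to the squared projection of the fiber onto that direction; this simple model, the function name, and the numbers are assumptions for illustration and are not the anti-symmetric dyadic tensor formulation itself.

```python
import numpy as np

def adc(direction, fiber, lam=2.0):
    """Apparent diffusion along `direction` for an idealized stick-like
    fiber: lam * (g . d)^2. Squaring discards the sign of the projection."""
    g = np.asarray(direction, dtype=float)
    g = g / np.linalg.norm(g)
    d = np.asarray(fiber, dtype=float)
    d = d / np.linalg.norm(d)
    return lam * float(g @ d) ** 2

# Hypothetical ground truth, unknown to the observer: midway between +x and -y.
true_fiber = [1, -1, 0]

# Measurements along the three axes: 1 on x, 1 on y, 0 on z.
mx = adc([1, 0, 0], true_fiber)
my = adc([0, 1, 0], true_fiber)
mz = adc([0, 0, 1], true_fiber)

# Both candidate dyads (each a +/- pair of vectors) fit those three numbers.
candidates = {"+x+y / -x-y": [1, 1, 0], "+x-y / -x+y": [1, -1, 0]}

# One additional measurement midway between the +x and +y axes.
extra_dir = [1, 1, 0]
extra = adc(extra_dir, true_fiber)

# The real dyad is the one whose prediction matches the extra measurement.
best = min(candidates, key=lambda name: abs(adc(extra_dir, candidates[name]) - extra))
print(best)
```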
In the 1992 patent application Filler et al provided both a simple vector model and tensor model for tractography and actually produced and published the first tractographic brain images. In the final patent they suggest selecting seed points in two remote axial slices and then using an algorithm to tract trace between the regions of interest based on the directional anisotropy data.
In 1999, Susumu Mori (see figure 16)[111, 112] reported success with tractography, in part by retracing the steps outlined in the Filler patent, but also providing further details of the algorithm. He filed a patent that year that was subsequently granted in 2003. In both the Filler et al 1992 and the Mori et al 1999 methods,[113, 114] one critical aspect is to select two areas demonstrating a high level of anisotropy and then to allow the algorithm to follow the principal direction of each voxel to travel from a seed or source point to reach a target point.
There are two methods for tractography that are explained in the Filler application. The first is based on the arctangent function (also applicable using an algorithm called “arctan2” in the version of FORTRAN used for the original work). This function results in the angle of the main vector relative to the selected Cartesian axes. This allowed images analogous to more modern tractography in which an angle parameter was set to determine image intensity. Anisotropic voxels sharing that angle were bright and others were dark, which resulted in a tractographic image that followed long tracts through the brain. Richards also reported that in some pathologies, there seemed to be more disturbance of the angular data than the vector length data.
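The seed-to-target idea can be sketched as a simple stepping procedure over a grid of per-voxel directions. This is a toy illustration only, not the algorithm of either patent: the function name, the stopping rules, the step size, and the synthetic volume are all assumptions.

```python
import numpy as np

def trace_tract(directions, anisotropy, seed, target,
                min_anisotropy=0.2, step=0.5, max_steps=1000):
    """Follow the principal direction voxel by voxel from seed toward target.
    Returns the list of visited voxels and whether the target was reached."""
    pos = np.array(seed, dtype=float)
    target = tuple(target)
    prev = None
    path = []
    for _ in range(max_steps):
        voxel = tuple(int(v) for v in np.floor(pos + 0.5))
        # Stop when the path leaves the volume.
        if any(i < 0 or i >= n for i, n in zip(voxel, anisotropy.shape)):
            return path, False
        # Stop when the voxel is not anisotropic enough to trust its direction.
        if anisotropy[voxel] < min_anisotropy:
            return path, False
        if not path or path[-1] != voxel:
            path.append(voxel)
        if voxel == target:
            return path, True
        d = directions[voxel]
        # The stored direction has no inherent sign; keep heading the same way.
        if prev is not None and np.dot(d, prev) < 0:
            d = -d
        pos = pos + step * d
        prev = d
    return path, False

# Synthetic 10 x 10 x 1 volume with one straight tract along x at y = 5.
anisotropy = np.full((10, 10, 1), 0.05)
directions = np.zeros((10, 10, 1, 3))
anisotropy[:, 5, 0] = 0.8
directions[:, 5, 0] = [1.0, 0.0, 0.0]

path, reached = trace_tract(directions, anisotropy, seed=(0, 5, 0), target=(9, 5, 0))
```

A practical tracer would also step in the opposite direction from the seed, since the stored direction could equally point away from the target; that is omitted here for brevity.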
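An angle-selected image of this kind can be sketched as follows. The example works in a single plane, uses NumPy's `arctan2` in place of the FORTRAN routine, and the function name, tolerance, and length threshold are assumptions rather than values from the original work.

```python
import numpy as np

def angle_image(dx, dy, target_deg, tol_deg=10.0, min_length=0.5):
    """Return a binary image: 1 where the in-plane vector is long enough
    and its angle lies within tol_deg of target_deg, 0 elsewhere."""
    angle = np.degrees(np.arctan2(dy, dx))
    length = np.hypot(dx, dy)
    # Angular difference wrapped into the range [0, 180].
    diff = np.abs((angle - target_deg + 180.0) % 360.0 - 180.0)
    return ((diff <= tol_deg) & (length >= min_length)).astype(np.uint8)

# Hypothetical 3 x 3 slice of per-voxel x and y diffusion components.
dx = np.array([[1.0, 1.0, 1.0],    # row 0: vectors at 45 degrees
               [1.0, 1.0, 1.0],    # row 1: vectors at 0 degrees
               [0.1, 0.1, 0.1]])   # row 2: 45 degrees but very short
dy = np.array([[1.0, 1.0, 1.0],
               [0.0, 0.0, 0.0],
               [0.1, 0.1, 0.1]])

# Select voxels oriented at 45 degrees: only row 0 lights up.
image = angle_image(dx, dy, target_deg=45.0)
```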
The second method used true tensor data in a connected voxel algorithm. This type of algorithm - which is a three dimensional elaboration on older “connected pixel” algorithms, provides for a threshold for eliminating voxels of low signal strength under the conditions assessed as well as for decision making about adjacent tracts. It is a seed based method that generates both linear and surface regions based on the input data. In Filler it was applied to the vector length/arctan angular data that describe the orientation of the primary diffusion vector in the voxel to assess connectedness to adjacent voxels. In addition Filler described the use of multiple gradient acquisition hardware that allowed mathematical assembly of an infinite number of differently oriented diffusion gradients run in echo planar sequences to obtain multidimensional tensor data of various ranks.
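A connected voxel approach of this general type can be sketched as seed-based region growing. The version below is a generic illustration, not the patented algorithm: it accepts a neighboring voxel only if its signal strength exceeds a threshold and its direction is close to that of the voxel it is reached from. The function name, the thresholds, the six-neighbor connectivity, and the test volume are assumptions.

```python
from collections import deque
import numpy as np

def grow_region(directions, strength, seed, min_strength=0.3, max_angle_deg=30.0):
    """Collect voxels connected to seed whose strength passes the threshold
    and whose direction stays within max_angle_deg of the neighbor they
    were reached from."""
    cos_min = np.cos(np.radians(max_angle_deg))
    shape = strength.shape
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    region = {seed}
    queue = deque([seed])
    while queue:
        v = queue.popleft()
        for off in offsets:
            n = tuple(a + b for a, b in zip(v, off))
            if n in region or any(i < 0 or i >= s for i, s in zip(n, shape)):
                continue
            # Threshold out voxels of low signal strength.
            if strength[n] < min_strength:
                continue
            # abs(): orientation matters, the sign of the vector does not.
            if abs(float(np.dot(directions[v], directions[n]))) >= cos_min:
                region.add(n)
                queue.append(n)
    return region

# Synthetic 5 x 5 x 1 volume: one tract along x at y = 2 and an adjacent,
# differently oriented tract at y = 3.
strength = np.full((5, 5, 1), 0.1)
directions = np.zeros((5, 5, 1, 3))
strength[:, 2, 0] = 0.9
directions[:, 2, 0] = [1.0, 0.0, 0.0]
strength[:, 3, 0] = 0.9
directions[:, 3, 0] = [0.0, 1.0, 0.0]

region = grow_region(directions, strength, seed=(0, 2, 0))
```

In this example the grown region contains only the five voxels of the seeded tract; the adjacent tract is strong enough to pass the threshold but is rejected because its orientation differs.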
Jay Tsuruda, a neuroradiologist who was a co-author on Moseley's original 1990 report of anisotropic diffusion and a co-inventor on the Filler patent, joined Richards, Filler and Howe in 1992 after the initial tensor and arctan tractographic work had been done, and started investigating additional issues in tractographic processing. Filler and Tsuruda (along with Grant Hieshima - a neuroradiologist who made several of the major inventions in the directable catheters of interventional radiology) formed a company called NeuroGrafix to develop the technology. In his capacity as chief scientific officer of the company, Tsuruda participated with other scientists in a series of further developments that help refine the tractographic method.[117-120] Members of the inventor group also reported extensively on the development and clinical evaluation of the peripheral nerve tractographic (= neurographic) methodology.[3, 5, 82, 121-126]
One continuing problem with tractographic methods has been that the ellipsoid tensor model of Basser and LeBihan cannot accommodate the biological situation of two neural tracts crossing through each other. This is because in the ellipsoid model there can be only one principal eigenvector or main longitudinal axis in a voxel. We can look at the short axes, but these are always orthogonal to the main axis and cannot represent any other direction.
In the anti-symmetric dyad model, we can have multiple different dyads arise from multiple measures. If there is one dominant measure in a voxel then any differences or “wobble” between the dyads will reflect the equivalent of the “radial diffusion” from the ellipsoid model - this assesses the degree of isotropy or noise in a voxel. However, if there are two different tracts in the voxel, then strong enough gradients and sufficiently numerous gradient acquisitions in various directions can result in dyadic tensors that group into two different directions reflecting the two different tracts. The HARDI (high angular resolution diffusion imaging) and q-ball methods work in this fashion by abandoning Basser and LeBihan's application of the classical diffusion ellipsoid model. David Tuch and Van Wedeen at the Massachusetts General Hospital were granted a patent for this method in 2006.
The Origins of the Diffusion Anisotropy Imaging Patent
The Neurography and Diffusion Anisotropy Imaging patent was an important step forward on the general problem of treating neural structures in their linear form, like bones or blood vessels, and it accomplished advances in this area on many fronts.
Aaron Filler first proposed an MRI nerve tract imaging project in 1988 at the University of Washington where he was a second year neurosurgery resident and the project went forward under one of the radiology faculty, Jim Nelson. Todd Richards was the lead physicist of the research group. The project envisioned the use of MR contrast agents for delivery by axonal transport with the intention of using a contrast agent to generate linear images of nerves and tracts that would be analogous to the axonal tracers he had used for anatomical studies as a graduate student at Harvard ten years earlier.
In 1990, Filler was working on that project at St. George's Hospital in London using a 4.7 Tesla imager with high slew rate 70 milliTesla/meter gradients (see figure 17) - note that at this time, most clinical imagers had only 10 milliTesla/meter gradients at best and these typically had much lower slew rates than the St. George's research system. A grant application for the MR tract imaging work was rejected by the MR imaging section at NIH but the project was funded by the Neurosciences Research Foundation of Atkinson Morley's Hospital (where Filler worked as a neurosurgical registrar) - the same facility that supported Hounsfield's project to deploy the first CT scanner.
Filler learned of Moseley's report on anisotropy in white matter when Filler gave a visiting presentation of his progress at the Hammersmith Hospital in the early fall of 1990. He then started formulating a plan to try to apply diffusion MRI to the nerve imaging problem. Working with Franklyn Howe, an Oxford trained MR physicist, he noticed that the chemical shift artifact at the very high field had separated the small nerves into two neighboring structures. When diffusion weighting was applied, his finding was similar to the minimal effect in peripheral nerve noticed by Moseley. However, in order to fully distinguish among the water and fat nerve images that partially overlapped in the forearm of a rabbit under anesthesia in the high field high gradient magnet, he added chemical shift selection fat suppression to the diffusion sequence and this yielded a remarkably large increase in apparent anisotropy in the water images of the nerve - quite aside from removing the fat signals from the image. This revealed that the nerve water included both isotropic (or slow diffusing) and anisotropic (or fast diffusing) components, but that the chemical shift selective pulse removed most of the isotropic water from the image because the isotropic water had a shorter T2.
The result was a pure nerve image with no use of contrast agents. He traced a series of images onto acetates and when the nerves in the series of slices were stacked up, they clearly revealed the three dimensional branching pattern of the major nerves of the forearm. The problem was that the nerve images would only be bright when the nerves were directly parallel to the gradient so image intensities dropped out and even disappeared as the nerves curved out of plane. Filler and Howe quickly discarded a three axis solution because of the bipolarity of diffusion and identified the solution as requiring a multiple gradient acquisition with tensor analysis. This was a solution that was apparent because Filler's work at that point included chemical work on manipulating the spinel crystal anisotropy of mixed ferrites he was working with for contrast agents. This was another example - as with Michael Moseley's background in smectic liquid crystals - where a background in the anisotropic diffusion science of crystals resulted in insights into water diffusion in images of neural tracts.
Yet another interesting cross pollination arose from Filler's PhD research in biological anthropology at Harvard. His 1986 PhD thesis dealt with the evolution of Miocene hominoids.[130-133] In particular there was great interest in accurately dating a Miocene vertebra from the Moroto site in Uganda. A key aspect of dating the site utilized studies of paleomagnetism. A similar issue arose with Miocene hominoid fossils from the Siwaliks in Pakistan. Filler's thesis adviser was David Pilbeam - who later served as Dean of Harvard College - and Pilbeam had played an important role in fostering the development of methodology in paleomagnetism. Paleomagnetic structure is assessed by making six differently oriented magnetic remanence measurements around a sample and then using a tensor ellipsoid calculation to determine the eigenvalues and eigenvectors[135-138] - almost exactly the method for diffusion tensor MRI. A 1990 summary article by two of Pilbeam's associates provides full details of the method and the mathematics. A significant portion of the initial theoretical work in paleomagnetism was done by Jelinek who concludes his article by stating: “We expect that this method will also be useful in other fields in which symmetric tensors of the 2nd order are employed.” Other more general works on the relevant tensor methodology are also available.[139, 140]
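The shared mathematics can be sketched in a few lines: six directional measurements determine the six independent elements of a symmetric second-order tensor, whose principal axis can then be extracted. The direction set, the function names and the use of power iteration below are assumptions made for this illustration; they are not taken from the paleomagnetic or diffusion literature cited above.

```python
import math

S = 1.0 / math.sqrt(2.0)
# One possible set of six measurement directions (an assumption for this sketch).
DIRECTIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (S, S, 0), (S, 0, S), (0, S, S)]

def measure(tensor, g):
    """Scalar observed along unit direction g, i.e. g^T D g."""
    return sum(g[i] * tensor[i][j] * g[j] for i in range(3) for j in range(3))

def tensor_from_six(m):
    """Recover the symmetric 3x3 tensor from measurements along DIRECTIONS."""
    dxx, dyy, dzz = m[0], m[1], m[2]
    # Along (x+y)/sqrt(2) the measurement is (dxx + dyy)/2 + dxy, and so on.
    dxy = m[3] - 0.5 * (dxx + dyy)
    dxz = m[4] - 0.5 * (dxx + dzz)
    dyz = m[5] - 0.5 * (dyy + dzz)
    return [[dxx, dxy, dxz], [dxy, dyy, dyz], [dxz, dyz, dzz]]

def principal_axis(tensor, iterations=200):
    """Largest eigenvalue and its eigenvector by power iteration.

    Assumes a positive-definite tensor with a distinct largest eigenvalue
    and a starting vector that is not orthogonal to the principal axis.
    """
    v = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        w = [sum(tensor[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return measure(tensor, v), v

# Illustrative tensor whose long axis lies along the x-y diagonal.
true_tensor = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 0.2]]
measurements = [measure(true_tensor, g) for g in DIRECTIONS]
recovered = tensor_from_six(measurements)
eigenvalue, eigenvector = principal_axis(recovered)
```

For this example tensor the recovered principal axis lies along the diagonal between x and y, a direction that none of the three axis-aligned measurements could identify alone.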
There is nonetheless an interesting intellectual, scientific and technological mystery about the whole series of events of the development of diffusion tensor imaging that seems to go to the heart of the way that humans advance their technologies. What is striking is that although many brilliant researchers discussed the diffusion tensor in NMR and MRI, it is clear that it was not really being calculated before 1992, even though there were many sources available to explain exactly how to go about it. In part, this appears to have occurred because the tensor was being used as a symbolic concept that states more or less that we all know the appropriate formalism to apply. However, in nearly all situations in NMR, all that we need to know is the orientation of the principal axis of the tensor since that will allow the measurement of the anisotropic diffusion coefficient. If you can find the orientation by manually rotating the sample on a turntable until you see the maximum output, then why bother going through elaborate data collection and calculation steps that once challenged Albert Einstein?
From a philosophical point of view, the history of diffusion tensor imaging shows that in the process of invention, in response to a perception of an unmet technological need, we must disrupt our symbolic understanding of the elements of a problem in order to see its components in their fundamental state, and then reassemble the elements into novel and unpredicted relationships and outputs.
Functional MRI (fMRI)
Earlier in this paper, the use of the spin echo to eliminate the “T2*” effects of local magnetic field inhomogeneities was discussed. Functional MRI (fMRI) is based on trying to enhance the impact of T2* effects that result from local blood flow. Louis Sokoloff had shown that in the neuroscience lab, radiolabeled (carbon 14) deoxyglucose (2-DG) could be used to track how much brain metabolism was taking place in various regions. With the tracer in blood, an experimental animal's brain would draw glucose into those regions with higher energy consumption. The synthetic glucose analog molecule would block the normal glucose breakdown and accumulate inside the cell - accumulating larger amounts in more active cells. Then when the animal was sacrificed and the brain was sectioned, the radiolabel would cause increased exposure of X-ray film at the locations with the most retained tracer.
Sokoloff and his colleagues then made FDG with fluorine-18 - a positron emitter. David Kuhl - who had worked on both radio-isotope scanning and an early CT scanner design - together with Michael Phelps (all at the University of Pennsylvania) had made good progress with a positron emission tomography scanner. Working together, Sokoloff, Kuhl, Phelps and colleagues then used 18-FDG and an early PET scanner to observe changes in regional metabolism in the living human brain.
Raichle and colleagues had been using simple detector arrays to monitor regional cerebral blood flow in humans with oxygen-15 (positron emitting) labeled water. This group also progressed to the use of PET scanning, deploying a variety of tracers including carbon-11 labelled glucose (to try to see a more normal glucose metabolism relative to fluorodeoxyglucose). A tremendous amount was learned about the physiology of cerebral metabolism by deploying these various techniques. However, they all required an intravenous injection of a powerful radiation source - something that seems appropriate for assessing the potential growth rate of a patient's brain tumor, but not for routine psychology experiments. Further, the spatial resolution of PET limited the degree of detail possible for these functional studies.
Belliveau and associates at Massachusetts General Hospital showed that MRI contrast agents would distribute differentially based on blood flow and that - at the time scale of MRI - it was possible to show a relative increase in contrast agent flow in areas of the brain that were most active. The initial clinical excitement was for the possibility of having a patient engage in a physical movement and using these functional images to help identify the motor strip of the brain's cortex.
However, unknown to Belliveau and the awestruck reviewers at Science, Seiji Ogawa (see figure 18) at AT&T's Bell Labs had already achieved a far more subtle and powerful solution. The effects of de-oxygenated blood are different from the effects of oxygenated blood. Oxygenated hemoglobin is diamagnetic, with no external magnetic field effect, but deoxygenated hemoglobin is paramagnetic and does have an external magnetic field effect. The process of deoxygenation - if it occurred in an area of increased brain activity - could mark that location by causing increased T2* effects. The general class of imaging techniques used is called BOLD for “blood oxygen level dependent” imaging.
Brain activity does increase blood flow to a brain region, but the level of control is not very fine in scale. If one small area has increased activity and increased demand, then a region that may be ten to fifty times larger may see the increased flow. However, although the active area will deoxygenate the blood more rapidly than the less active areas, the blood flow response overcompensates.
Gusnard and Raichle pointed out that background oxygen extraction fraction (OEF) rather than oxygen consumption per se would be the best measure because it is relatively uniform across the brain at rest. Because of the overcompensation of flow in response to activity, the OEF actually decreases in areas of increased activity. With this information in hand and with appropriate pulse sequences selected, even very fine scale patterns of brain activation could be reliably monitored. An extra bonus was the finding that the time scale of the changes was shorter when assessed in this way.
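The reason the OEF falls when flow overcompensates follows from the Fick relation, in which oxygen consumption equals blood flow times OEF times arterial oxygen content. A small worked example makes the arithmetic explicit. The function name and the percentage changes used here are illustrative assumptions, not values reported in the sources cited in this paper, and arterial oxygen content is assumed to stay constant.

```python
def relative_oef(flow_change, consumption_change):
    """Ratio of activated to resting OEF (hypothetical helper).

    Arguments are fractional changes (0.30 means +30%). From the Fick
    relation, consumption = flow * OEF * arterial O2 content, so with
    arterial content unchanged the OEF scales as consumption / flow.
    """
    return (1.0 + consumption_change) / (1.0 + flow_change)

# Illustrative numbers only: flow rises 30% while consumption rises 5%.
ratio = relative_oef(flow_change=0.30, consumption_change=0.05)
print(f"OEF during activation = {ratio:.2f} x resting OEF")
# prints: OEF during activation = 0.81 x resting OEF
```

Whenever the fractional rise in flow exceeds the fractional rise in consumption, the ratio is below one, so the venous blood leaving the active region is better oxygenated than at rest.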
These changes have led fMRI researchers to deploy very high resolution systems that can differentiate progressively more precise patterns and locations of activity. The analysis of these activations has progressed both toward the particular - identifying precise regions of function along a cortical gyrus, and also toward the level of organization of higher level patterns - assessing patterns of coactivation between limbic, temporal, frontal and parietal functional centers.
Debra Gusnard (see figure 18), a neuroradiologist trained at the University of Chicago and University of Pennsylvania after studying at the Sorbonne (University of Paris), chose to do a second residency in psychiatry while doing fMRI research at the Mallinckrodt Institute of Washington University in St. Louis. Double boarding in neuroradiology and psychiatry would have been difficult to predict a few years ago, but Gusnard's work has shown how compelling this may prove to be in the future. Working with Marcus Raichle, she authored or co-authored several widely cited major papers reporting important advances in the understanding of the baseline functioning of the human brain as well as establishing the OEF (oxygen extraction fraction) as a key paradigm for quantitation and analysis of fMRI data.[147-151]
In addition, however, Gusnard has helped launch a fascinating new field in which fMRI is deployed to gain biological insight into elements of thought, perception and consciousness. She has pointed out that although traditional neuropsychology has generally considered the concept of “self” as non-biological, the baseline function concept of fMRI provides an alternative explanation. By monitoring the degree of function in coordinated regions of brain when the individual has no external stimuli, the functional properties and components of “self” become subject to study.[10, 147] Gusnard also points out that other fundamental aspects of consciousness - such as attention, self reflection, motivation, and the temporal sequencing of thought - are becoming increasingly susceptible to study on a biological basis. This helps provide a substantive methodological basis to the widely anticipated possibility that not only the functioning of the mind but also the pathological variations from normal function will be progressively unraveled by future progress in fMRI.
Although tremendous strides have been made in fMRI using the BOLD technique, the field of fMRI has started to undergo another transformation due to methodological improvements in diffusion imaging. In 2001, Le Bihan and colleagues noticed that the isotropic diffusion measurement in grey matter increased with functional activation. Recently, advances in signal to noise performance of scanners have led to the finding that diffusion methods can be used to measure functional activation. This measure is entirely different from the oxygen consumption model that dates back to the laboratory autoradiography studies. It appears to be due to swelling of cells associated with their neural activation. The diffusion effect (DfMRI) starts abruptly within 1 second and then resolves before the BOLD changes even start to appear. Onset and resolution take 2-3 seconds for DfMRI and about 9-10 seconds for BOLD studies, so the time resolution is much better using diffusion. In addition, the spatial resolution of the changes appears to be more precise.
Diffusion methods detect a fast diffusing phase and a slow diffusing phase. The relative amount of water in the slow diffusing phase (restricted diffusion) increases with brain activation. The actual cellular and biophysical basis for this remains unclear. It is also unclear whether the already low anisotropy of the grey matter changes as well. [85-87, 152] A similar cellular swelling phenomenon seems to affect the axons of activated neurons as well - this change is proving to be observable with a DTI paradigm that may be termed fDTI.
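A two-pool (biexponential) signal model is one common way to express a fast and a slow diffusing phase, and a short sketch shows why a larger slow fraction raises the diffusion weighted signal. The function name, the diffusion coefficients, the b value and the pool fractions below are placeholder assumptions for illustration only; they are not measurements from the studies cited here.

```python
import math

def biexp_signal(b, slow_fraction, d_fast=1.2e-3, d_slow=0.3e-3):
    """Normalised signal S(b)/S0 for a two-pool model (hypothetical helper).

    b is in s/mm^2 and the diffusion coefficients are in mm^2/s.
    The default coefficients are illustrative placeholders.
    """
    fast_fraction = 1.0 - slow_fraction
    return (slow_fraction * math.exp(-b * d_slow)
            + fast_fraction * math.exp(-b * d_fast))

# Illustrative comparison: a small shift of water into the slow pool.
b_value = 1800.0
rest = biexp_signal(b_value, slow_fraction=0.30)
active = biexp_signal(b_value, slow_fraction=0.32)
relative_change = (active - rest) / rest
```

Because the slow pool loses less signal at a given b value, moving water from the fast pool to the slow pool increases the measured signal, which is the direction of change the model predicts for activation.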
Overall, the competitive arenas of the academic, intellectual property, and corporate aspects of these historical developments appear to have acted to spur on the advance of technology. It is certainly clear in this area that patents must be considered along with academic publications if we want to clearly understand the historical sequence of ideas and innovations.
Medical imaging continues to be an exciting focus that draws in the most complex aspects of physics, mathematics, computers and neuroscience. Neurosurgeons must remain closely engaged with this process - recognizing where critical clinical needs are not being met by existing technology while striving to find insight into potential solutions. In this way, further rounds of advancement and insight will best serve the practitioners.
Ultimately, a medical image is an extension of the physical exam, allowing the surgeon to probe and examine the patient. As imaging methodology draws more subtle and complex functional capability into the diagnostic arena, the range of problems that will be available for neurosurgeons to try to treat will certainly continue to grow larger as well (see figure 19).
I thank H. Richard Winn, B. Anthony Bell, John R. Griffiths, Franklyn A. Howe, Todd L. Richards, Terrence W. Deacon, Jodean Peterson, and Shirlee B. Jackson for their assistance, contributions, and inspiration in the course of this effort. The original introduction and guidance for the tractographic concept as it arose in axonal transport work was through two great innovators and teachers, Walle J. H. Nauta and Richard L. Sidman.