High-grade glial tumors are among the most prevalent primary brain tumors in dogs, and their prognosis is grave. In the United Kingdom, the incidence of intracranial neoplasia in dogs is approximately 20 cases/10,000 dogs/y.1 Median survival times for affected dogs range from 2 to 3 months, despite the use of novel and aggressive treatments.2,3 In humans, a high correlation exists between the amount of tumor that remains after resection and the rate of tumor recurrence, which indicates that advanced imaging can be a valuable prognostic tool.4 The incidence of primary brain tumors is reportedly greater in dogs than in humans, so many subjects are potentially available for preclinical and clinical trials to assess effects of novel experimental treatments in dogs. Additionally, in terms of disease-free interval, progression of neoplasms in dogs occurs at a rate 4.6 times as fast as that in humans, creating the potential to more rapidly advance science.5
Antemortem quantification of lesion size via CT or MRI is commonly performed to assess response to treatment in experimental and clinical studies. Serial gadolinium-enhanced MRI and clinical evaluation are important methods used in human brain tumor research to monitor therapeutic efficacy and progression-free survival times. Historically, the visual metric method involving Macdonald criteria6 has been used, by which the product of the 2 greatest orthogonal diameters is calculated to estimate brain tumor size. However, technological advances have resulted in new software that allows more precise measurements. Such advances include visual tracing of the tumor perimeter (planimetry method)7,8; computer-assisted, threshold-based methods8,9; and fully automated, computer facilitated methods.10–12 Since the original Macdonald criteria were developed, new criteria have been proposed to assess therapeutic responses in both human and veterinary neuro-oncology patients. For brain tumor evaluation in human medicine, these include updated Macdonald criteria,13 response evaluation criteria in solid tumors,14 and response assessment in neuro-oncology.15 Recently, criteria were proposed for assessment of veterinary neuro-oncology patients, referred to as response assessment in veterinary neuro-oncology.16 Criteria for human and veterinary assessment are largely based on diameter measurements in 2 or 3 dimensions.
The original Macdonald criteria, developed in 1990, involved use of the largest cross-sectional diameter in a single plane on an MRI scan, multiplied by the perpendicular diameter in the same plane to estimate the size of brain tumors. These criteria were developed to assess response to various treatments on the basis of MRI findings and clinical signs to designate complete or partial response, stable disease, or progression, but they also provided a foundation for tumor volume assessment. The Macdonald criteria were largely developed in response to the WHO initiative to standardize the reporting of patient responses to various cancer treatments. Although tumor volumes in dogs may be estimated from MRI scans through similar methods, the authors are unaware of any studies conducted to compare and validate different volumetric quantification techniques for glioma in dogs. Veterinary researchers and clinicians are now treating dogs with glioma by surgical resection and other therapeutic modalities and need an accurate and reproducible method to measure response to treatment.
The purpose of the retrospective study reported here was to assess accuracy and reproducibility of 2 protocols to measure tumor volume in dogs by use of MRI data. The optimal method would involve computer software that was user-friendly, widely available to veterinarians, and relatively cost effective. It would also allow an individual with the skill and knowledge of a technologist to calculate tumor volume in a repeatable manner that was operator independent. Our null hypothesis was that there would be no significant differences in interoperator and intraoperator reliability between the planimetry method and visual metric method for estimation of brain tumor volume in dogs.
Materials and Methods
Sample
A total of 66 image series from 22 sets of brain MRI scans were included in the study. Scans pertained to 22 client-owned dogs with histologically confirmed high-grade glioma. Images were acquired by several referral institutions that had MRI units of various magnet strengths, ranging between 1.0 and 3.0 T. All dogs had brain tumors that yielded some degree of contrast enhancement after gadolinium administration at a dose of 1 mL/4.45 kg. All measurements were performed on T1-weighted postgadolinium images in 3 anatomic planes (axial, sagittal, and coronal). All dogs had undergone surgical debulking of solitary primary brain tumors, and tissue specimens from those tumors were histologically evaluated at the University of Minnesota Masonic Cancer Center by pathologists who were board-certified by the American College of Veterinary Pathology.
Volumetric measurements
Two operators, one experienced in reading MRI scans (KHH) and another with no previous experience reading MRI scans (CBT), independently evaluated each MRI scan 4 times, performing 2 measurements for each of the 2 methods (Figure 1). To decrease familiarity or recognition of specific tumors, MRI studies in the different planes were randomized between successive measurement events. The 2 operators were selected on the basis of their availability for the time commitment involved in these evaluations and their general lack of relevant diagnostic expertise, which could be expected in intended future users. Prior to performing volumetric calculations for the study, operators participated in a brief instructional session about the software used to perform the 2 volumetric methods and in practice sessions involving each method and sample MRI scans. In addition, a veterinary surgeon who had 10 years of experience specializing in the treatment of dogs with brain tumors made measurements with both methods to serve as a reference standard for comparisons. The imaging software (footnote a) used to obtain measurements was chosen on the basis of affordability, technical simplicity, and availability.
The visual metric method consisted of identifying and marking the longest diameter of the gadolinium-enhanced portion of the tumor mass by use of digital calipers within the designated software on each image slice and subsequently identifying and marking the orthogonal diameter of the mass (Figure 2).3 The 2 diameters were used to calculate an area for every slice in which tumor could be identified by use of the following formula:
where d1 and d2 are orthogonal diameters. The sum of the area measurements, multiplied by slice thickness and intersection gap, was used to determine the volume of each tumor in each of the 3 anatomic planes.
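The visual metric calculation described above can be sketched in Python. The area formula is not reproduced in this text, so the sketch assumes the area of an ellipse, A = π × d1 × d2 / 4, which matches the ellipse-shaped areas mentioned in the Discussion. It also assumes that the distance represented by each slice is the slice thickness plus the gap between slices. The function names and example diameters are hypothetical and are not study data.

```python
import math


def ellipse_area(d1_mm, d2_mm):
    """Area (mm^2) of an ellipse whose axes are two orthogonal diameters.

    Assumption: the study's per-slice area formula is the ellipse area.
    """
    return math.pi * d1_mm * d2_mm / 4.0


def visual_metric_volume(diameters_mm, slice_thickness_mm, gap_mm):
    """Sum per-slice ellipse areas and multiply by the slice spacing.

    diameters_mm: one (d1, d2) pair per slice in which tumor is visible.
    Assumption: slice spacing = slice thickness + gap between slices.
    """
    spacing_mm = slice_thickness_mm + gap_mm
    total_area = sum(ellipse_area(d1, d2) for d1, d2 in diameters_mm)
    return total_area * spacing_mm


# Hypothetical tumor visible on three slices of one imaging plane.
example_slices = [(4.0, 2.0), (6.0, 3.0), (5.0, 2.0)]
volume = visual_metric_volume(example_slices, slice_thickness_mm=3.0, gap_mm=0.5)
print(f"Estimated volume: {volume:.2f} mm^3")
```

In the study design, the same calculation would be repeated separately for the axial, sagittal, and coronal series of each dog.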
The planimetry method consisted of manually tracing and segmenting the gadolinium-enhanced portion of the tumor mass on each individual slice by use of the designated software. The software automatically calculated the area from the traced perimeter. Consistent with the visual metric method, area measurements were summed and multiplied by the slice thickness and intersection gap and the volume of each tumor was determined in each of the 3 anatomic planes. Volumes calculated on the basis of each method were used for statistical analysis.
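The planimetry calculation can be sketched in the same way. How the imaging software computes area from a traced perimeter is not described here, so the sketch assumes the traced outline is a closed polygon of (x, y) points in millimeters and uses the shoelace formula for its area. As before, the slice spacing is assumed to be slice thickness plus gap, and all names and coordinates are hypothetical.

```python
def polygon_area(points):
    """Area (mm^2) of a closed polygon via the shoelace formula.

    points: list of (x, y) vertices in mm, in tracing order.
    Assumption: a traced perimeter can be treated as a simple polygon.
    """
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the outline
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def planimetry_volume(traced_slices, slice_thickness_mm, gap_mm):
    """Sum traced areas over slices and multiply by the slice spacing.

    Assumption: slice spacing = slice thickness + gap between slices.
    """
    spacing_mm = slice_thickness_mm + gap_mm
    return sum(polygon_area(outline) for outline in traced_slices) * spacing_mm


# Two hypothetical traced outlines (a square and a right triangle).
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
triangle = [(0.0, 0.0), (6.0, 0.0), (0.0, 3.0)]
volume = planimetry_volume([square, triangle], slice_thickness_mm=3.0, gap_mm=0.5)
print(f"Estimated volume: {volume:.2f} mm^3")
```

Because a polygon follows the traced outline rather than a fixed shape, concavities in the enhancing margin reduce the computed area, which an ellipse defined by 2 diameters cannot represent.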
Statistical analysis
Agreement indices were calculated to assess reliability of each method within and between operators by use of the following equation8:
where xa and xb represent different sets of measurements.
As the AI approaches 1, the 2 compared volume measurements are more similar and therefore more reliable. Intraoperator reliability was defined as the degree of agreement, or similarity, between 2 volume calculations made by the same operator for the same tumor in an individual plane. For calculation of intraoperator AI, xa was the first calculated volume and xb was the second calculated volume by the same operator for the same method for each tumor in 1 plane. Interoperator reliability was defined as the agreement between operators in their volume calculation for the same tumor in an individual plane. Because each operator calculated 2 volumes for each tumor with both methods in each plane, the mean of the 2 volumes was used for interoperator comparisons. For the interoperator AI, xa was the mean of the 2 calculated volumes for the first operator and xb was the mean of the 2 calculated volumes for the second operator by use of the same method for the same tumor.
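The intraoperator and interoperator comparisons defined above can be sketched in Python. The AI equation is not reproduced in this text, so the sketch assumes the form AI = 1 − |xa − xb| / ((xa + xb) / 2), which is consistent with the statement that values approaching 1 indicate closer agreement. The function names and example volumes are hypothetical and are not study data.

```python
def agreement_index(xa, xb):
    """Agreement index between two volume measurements.

    Assumed form: AI = 1 - |xa - xb| / mean(xa, xb).
    AI equals 1 when the two measurements are identical.
    """
    mean_value = (xa + xb) / 2.0
    if mean_value == 0:
        raise ValueError("AI is undefined when both measurements are zero")
    return 1.0 - abs(xa - xb) / mean_value


def intraoperator_ai(first_volume, second_volume):
    """AI between one operator's two measurements (same tumor, method, plane)."""
    return agreement_index(first_volume, second_volume)


def interoperator_ai(operator1_volumes, operator2_volumes):
    """AI between operators, using each operator's mean of repeated volumes."""
    mean1 = sum(operator1_volumes) / len(operator1_volumes)
    mean2 = sum(operator2_volumes) / len(operator2_volumes)
    return agreement_index(mean1, mean2)


# Hypothetical volumes (mm^3) for one tumor, one method, one plane.
op1 = (40.0, 44.0)
op2 = (50.0, 54.0)
print(f"Intraoperator AI, operator 1: {intraoperator_ai(*op1):.3f}")
print(f"Interoperator AI: {interoperator_ai(op1, op2):.3f}")
```

Under this assumed form, the AI can become negative when the two measurements differ by more than their mean, so it is not bounded below by 0.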
Because of the wide range in tumor sizes, raw data from the volume calculations were logarithmically (base 10) transformed in an attempt to normalize the values. For intraoperator AI comparisons, a mixed ANOVA model was fit by use of statistical software (footnote b), with dog included as a block effect. For interoperator AI, a similar mixed ANOVA model was fit but without operator as a variable because the response was not unique to each operator but to the AI between the two.
For each model, terms were tested for significance with type II tests; additionally, least squares means were computed to estimate the mean value for each level and compared via pairwise comparisons. The Tukey honest significant difference test was used for variables with > 2 categories, such as view and method. Values of P < 0.05 were considered significant.
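One way to write the models described above is shown here as a sketch only. It assumes main effects for operator, method, and plane with no interaction terms, and it treats the block effect for dog as a random intercept; the text does not specify either detail, and the symbols are introduced for illustration.

```latex
% Intraoperator model (assumed structure): AI for dog d, operator o, method m, plane p
\mathrm{AI}_{domp} = \mu + \alpha_o + \beta_m + \gamma_p + b_d + \varepsilon_{domp},
\qquad b_d \sim N(0, \sigma_b^2), \quad \varepsilon_{domp} \sim N(0, \sigma^2)

% Interoperator model (assumed structure): the operator term is dropped
\mathrm{AI}_{dmp} = \mu + \beta_m + \gamma_p + b_d + \varepsilon_{dmp}
```

Here μ is the overall mean, α, β, and γ are fixed effects for operator, method, and plane, b is the dog block effect, and ε is residual error.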
Results
Tumors
All but 1 tumor were classified as grade III or IV on the basis of WHO criteria and consisted of glioblastoma multiforme (n = 11), anaplastic astrocytoma (7), anaplastic oligodendroglioma (3), and primitive neuroectodermal tumor (1). One tumor was classified as a ganglioglioma, which in humans would be classified as a WHO grade I tumor but in dogs has unknown biological behavior.
A total of 528 volume measurements were made from the MRI scans (Figure 3). Tumor volumes ranged from 6.61 to 252.21 mm³, with mean and median volumes of 62.80 mm³ and 42.71 mm³, respectively. Estimated least squares mean (SE) logarithmic values for tumor volume were 1.61 (0.07) and 1.69 (0.07) for each of the 2 inexperienced operators; 1.68 (0.07) and 1.61 (0.07) for the visual metric and planimetry methods, respectively; and 1.66 (0.07), 1.63 (0.07), and 1.66 (0.07) for the axial, coronal, and sagittal planes, respectively. Mean difference in logarithmic volumes between the 2 operators was 0.08, representing a mean difference in tumor size of 18% (P < 0.001). Volumes measured by 1 inexperienced operator were on average smaller, regardless of method or plane used, than those made by the other inexperienced operator. Mean difference in volumes between the 2 methods was 0.07, representing a mean difference in size of 16% (P < 0.001). Volume calculations made with the planimetry method were smaller than those made with the visual metric method, regardless of operator or plane. No significant differences were identified in volume measurements among imaging planes.
Agreement indices used in assessments of intraoperator agreement were graphically plotted (Figure 4). In these assessments, one inexperienced operator had a median AI of 0.94, whereas the other had a median AI of 0.89, indicating that the first operator's measurements were more precise (P < 0.001). The visual metric method had a median AI of 0.89 (mean ± SD, 0.79 ± 0.24), and the planimetry method had a median AI of 0.94 (0.89 ± 0.17), indicating significant (P < 0.001) differences between the 2 methods, with the planimetry method having less variability. There were no significant differences among planes.
Agreement indices used in assessments of interoperator agreement were graphically plotted (Figure 5). In these assessments, the visual metric method had a median AI of 0.77 (mean ± SD, 0.68 ± 0.28) and the planimetry method had a median AI of 0.74 (0.67 ± 0.31). These median values did not differ significantly. Mean and median interoperator AIs were lower than mean and median intraoperator AIs. Median AIs for each plane were 0.63 for axial, 0.73 for coronal, and 0.66 for sagittal. These values differed significantly (P = 0.01); specifically, variability between operators was greater for measurements made on axial-plane images than for those made on coronal-plane images.
Discussion
Results of the present study indicated that the planimetry method of measuring brain tumor volume in dogs by use of MRI scans was more reliable than the visual metric method for repeated measurements, particularly when made by the same individual. Comparison of measurements between 2 fairly inexperienced operators revealed that the methods were equivalent in terms of reliability. Mean AIs were consistent with calculations of operator variability in another study.7 Differences in tumor volume calculation between operators that led to greater variability were likely attributable to the inability to precisely define tumor borders.
Both operators in the present study reported that measurements were faster and easier to perform when the planimetry method was used. When the visual metric method was used, it took more time to successfully find the longest diameter, the longest orthogonal diameter, and the angle measurements to ensure the 2 diameters were perpendicular than it did to trace the gadolinium-enhanced perimeter with the planimetry method. Theoretically, tracing the tumor would provide a more reliable estimation of absolute tumor volume because delineating the orthogonal diameters resulted in an ellipse-shaped area in each segment, whereas tracing allowed more flexibility in perimeter measurement (Figure 2). This inherent limitation of the original method described by Macdonald to accurately measure irregularly shaped tumors has been described in human medical literature.14 The planimetry method was designed to mitigate this shortcoming through a more adaptable determinant of tumor area per slice. Data reported here for dogs suggested that use of the visual metric method resulted in a considerable overestimation of tumor volume because of failure to account for perimeter voids, an error that did not occur with the planimetry method.
The present study revealed no significant differences in the volumetric data acquired from 3 MRI planes by the same operator. However, significant differences were identified between operators. Volume calculations made from the images in the coronal plane had the lowest variability. The magnitude of variability was significantly different between volumes measured in the coronal versus axial planes. This information would be beneficial when a single plane is used for volume calculations. The original Macdonald criteria involved estimation of tumor size by measuring area from the single MRI slice with the greatest tumor diameter. A study17 involving humans with malignant glioma revealed that calculation of tumor volume rather than area from a single image is more sensitive and specific to changes in tumor size over time.
In the study reported here, all measurements were made on T1-weighted, gadolinium-enhanced images that revealed certain characteristics of high-grade glioma, including a profound mass effect, peritumoral edema, ring enhancement, and central necrotic area.3,18,19 Although the visibility of many high-grade gliomas is enhanced with contrast techniques, which makes it easier to estimate tumor borders, this characteristic is not universal.3 T2-weighted images may be useful for measuring volume of tumors that lack enhancement when fluid-attenuated inversion recovery sequence images are used to differentiate peritumoral edema from tumor during segmentation. This is also important because contrast enhancement can be affected by treatment with antiangiogenic or corticosteroid drugs and with certain radiographic techniques without true changes to the underlying tumor size.15 We used pretreatment MRI scans in the present study to minimize the effects associated with tumor pseudoprogression or hyper- or hypoenhancement associated with other treatments or procedures.
Many factors need to be taken into account when evaluating treatment efficacy in patients with brain tumors. Neurocognitive function, resolution of clinical signs, quality of life, and progression-free survival time are variables currently used to assess response to treatment. New methods need to be developed to provide a more objective assessment than is currently possible. In the present study, only pretreatment tumor volumes were measured; however, serial evaluations are needed to assess patient response to treatment, tumor volume after resection, and progressive changes in tumor volume. The Macdonald criteria and new criteria to assess response to novel treatments by veterinary neuro-oncology patients over time have not been evaluated, and research is warranted into the applicability of these criteria to gliomas in dogs.
A limitation of the present study was that, except for the highly trained operator whose results were used as the reference standard, only 2 inexperienced operators calculated the tumor volumes. Use of more operators with varied experience would minimize study error attributable to interindividual variance. However, use of a homogeneous, highly trained group of operators does not necessarily lead to a decrease in interoperator variance.20 This supposition was supported by the scatter plots in the present study in which values calculated by the 2 inexperienced operators were plotted along with those calculated by the highly trained operator (Figure 3). More training and practice trials may allow each operator to more precisely define or recognize diseased regions on MRI scans. We found that the lowest variation among volume calculations was achieved when the same individual made the measurements, which is an important factor to consider when assessing serially acquired MRI scans.
Inherent limitations of the visual metric and planimetry methods include the use of stacked slices for volume determination. Volume calculated by summation of areas measured from stacked slices may vary from the absolute tumor volume. Each slice assumes a slab thickness that may not be consistent throughout the given slice. Additionally, potential exists for over- and underestimation of volume, depending on the extent of inclusion or exclusion of the end cap volumes.4 In the present study, images had been acquired at several institutions with MRI units of differing strength, ranging from 1.0 to 3.0 T. This variability may have affected image evaluation but, in the authors’ opinion, appropriately reflected how the technique would be used in clinical practice or research.
Accurate assessment of treatment response is important to tailor treatment to individual patients, facilitate communications about tumor size, perform clinical trials, and allow reliable comparisons of treatment results. The planimetry method used in the present study offers an alternative method to the previously described standard for measuring tumor volume. Continual assessment of and improvement in the reliability of MRI measurements will allow this modality to be used as end point verification for research and will aid in prognosis determination and treatment planning for dogs with brain tumors.
Acknowledgments
Supported in part by the American Humane Association.
The authors thank Dr. Aaron Rendahl for performing the statistical analyses.
ABBREVIATIONS
AI: Agreement index
WHO: World Health Organization
Footnotes
a. OsiriX DICOM software, version 5.6, Pixmeo Sarl, Geneva, Switzerland.
b. R Core Team (2013), R: a language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria.
References
1. Dobson JM, Samuel S, Milstein H, et al. Canine neoplasia in the UK: estimates of incidence rates from a population of insured dogs. J Small Anim Pract 2002; 43: 240–246.
2. Candolfi M, Curtin JF, Nichols WS, et al. Intracranial glioblastoma models in preclinical neuro-oncology: neuropathological characterization and tumor progression. J Neurooncol 2007; 85: 133–148.
3. Galloway RL Jr, Maciunas RJ, Failinger AL, et al. Volumetric measurement of canine gliomas using MRI. Magn Reson Imaging 1990; 8: 161–165.
4. Wood JR, Green SB, Shapiro WR. The prognostic importance of tumor size in malignant gliomas: a computed tomographic scan study by the Brain Tumor Cooperative Group. J Clin Oncol 1988; 6: 338–343.
5. Paoloni M, Khanna C. Translation of new cancer treatments from pet dogs to humans. Nat Rev Cancer 2008; 8: 147–156.
6. Macdonald DR, Cascino TL, Schold SC Jr, et al. Response criteria for phase II studies of supratentorial malignant glioma. J Clin Oncol 1990; 8: 1277–1280.
7. Joe BN, Fukui MB, Meltzer CC, et al. Brain tumor volume measurement: comparison of manual and semiautomated methods. Radiology 1999; 212: 811–816.
8. Sorensen AG, Patel S, Harmath C, et al. Comparison of diameter and perimeter methods for tumor volume calculation. J Clin Oncol 2001; 19: 551–557.
9. Moonis G, Liu J, Udupa JK, et al. Estimation of tumor volume with fuzzy-connectedness segmentation of MR images. AJNR Am J Neuroradiol 2002; 23: 356–363.
10. Clarke LP, Velthuizen RP, Clark M, et al. MRI measurement of brain tumor response: comparison of visual metric and automatic segmentation. Magn Reson Imaging 1998; 16: 271–279.
11. Fletcher-Heath LM, Hall LO, Goldgof DB, et al. Automatic segmentation of non-enhancing brain tumors in magnetic resonance images. Artif Intell Med 2001; 21: 43–63.
12. Kanaly CW, Mehta AI, Ding D, et al. A novel, reproducible, and objective method for volumetric magnetic resonance imaging assessment of enhancing glioblastoma. J Neurosurg 2014; 121: 536–542.
13. Miller AB, Hoogstraten B, Staquet M, et al. Reporting results of cancer treatment. Cancer 1981; 47: 207–214.
14. Therasse P, Arbuck S, Eisenhauer E, et al. New guidelines to evaluate the response to treatment in solid tumors. European Organization for Research and Treatment of Cancer, National Cancer Institute of the United States, National Cancer Institute of Canada. J Natl Cancer Inst 2000; 92: 205–216.
15. Vogelbaum M, Jost S, Aghi M, et al. Application of novel response/progression measures for surgically delivered therapies for gliomas: Response Assessment in Neuro-Oncology (RANO) Working Group. Neurosurgery 2012; 70: 234–243.
16. Rossmeisl JH, Garcia PA, Daniel GB, et al. Invited review—neuroimaging response assessment criteria for brain tumors in veterinary patients. Vet Radiol Ultrasound 2014; 55: 115–132.
17. Gladwish A, Koh ES, Hoisak J, et al. Evaluation of early imaging response criteria in glioblastoma multiforme. Radiat Oncol 2011; 6: 121.
18. Lipsitz D, Higgins RJ, Kortz GD, et al. Glioblastoma multiforme: clinical findings, magnetic resonance imaging, and pathology in five dogs. Vet Pathol 2003; 40: 659–669.
19. Kraft SL, Gavin PR, Dehaan C, et al. Retrospective review of 50 canine intracranial tumors evaluated by magnetic resonance imaging. J Vet Intern Med 1997; 11: 218–225.
20. Provenzale JM, Ison C, DeLong D. Bidimensional measurements in brain tumors: assessment of interobserver variability. AJR Am J Roentgenol 2009; 193: 515–522.