Original Research Communication
1 From the Children's Hospital of New York (MH); the Departments of Pediatrics (SMA and PC) and Medicine (DK) and the Body Composition Unit (MH, JW, and RNP), St Luke's-Roosevelt Hospital, Columbia University, New York; Westat, Rockville, MD (JB); and the National Institute of Child Health and Human Development, Bethesda, MD (JM).
2 Supported by grant no. DK37352 from the NIH, grant no. 50706 from the Elizabeth Glaser Pediatric AIDS Foundation, and contract no. NO1-HD-3-3162 from the National Institute of Child Health and Human Development.
3 Reprints not available. Address correspondence to M Horlick, Body Composition Unit, Plant Basement, St Luke's-Roosevelt Hospital Center, 1111 Amsterdam Avenue, New York, NY 10025. E-mail: mnh1{at}columbia.edu.
| ABSTRACT |
|---|
Objective: The objectives were to evaluate the performance of 13 published pediatric BIA-based predictive equations for total body water (TBW) and fat-free mass (FFM) and to refit the best-performing models.
Design: We used TBW by deuterium dilution, FFM by dual-energy X-ray absorptiometry (DXA), and BIA-derived variables to evaluate BIA models in a cross-sectional study of 1291 pediatric subjects aged 4–18 y, from several ethnic backgrounds, including 54 children with HIV infection and 627 females. The best-performing models were refitted according to criterion values from this population, cross-validated, and assessed for performance. Additional variables were added to improve the predictive accuracy of the equations.
Results: The correlation between predicted and criterion values was high for all models tested, but bias and precision improved with the refitted models. The 95% limits of agreement between predicted and criterion values were ±16% and ±11% for TBW and FFM, respectively. Bias was significant for some subgroups, and there was greater loss of precision in specific age groups and pubertal stages. The models with additional variables eliminated bias, but the limits of agreement and the loss of precision persisted.
Conclusion: This study confirms that BIA prediction models may not be appropriate for individual evaluation but are suitable for population studies. Additional variables may be necessary to eliminate bias for specific subgroups.
Key Words: Bioelectrical impedance analysis, pediatrics, ethnicity, puberty, HIV infection
| INTRODUCTION |
|---|
BIA estimation is based on the assumption that the body is a cylindrical ionic conductor in which the extracellular and intracellular fat-free compartments act as resistors and capacitors, respectively (2, 3). BIA is inexpensive, rapid, and noninvasive, and it has been proposed as an alternative to laboratory-based methods of measuring body composition in children (4–15). Pediatric prediction equations for total body water (TBW) and fat-free mass (FFM) use BIA-derived variables (eg, resistance in ohms) in various combinations with height, weight, age, and sex, but the applicability of these equations to children at different stages of maturation, of different ethnic backgrounds, or with specific medical problems has been questioned (13, 14, 16–19). These models were developed mostly in white subjects, some of whom were healthy and others of whom had medical disorders. The study populations ranged in size from 26 to 246 subjects and in age from 3–36 mo to 3–19 y. Pediatric body-composition projects at St Luke's-Roosevelt Hospital in New York City enrolled 1291 children and adolescents from several ethnic groups. Among these subjects, TBW was measured by deuterium dilution in 1170, FFM was measured by DXA in 1247, and BIA was performed in all 1291.
The purpose of the current study was to use this body-composition database to evaluate 13 published and frequently cited single-frequency BIA models (equations) for predicting TBW or FFM that were either developed or validated in pediatric populations (Table 1). Although DXA is not considered the gold standard for predicting FFM, as deuterium dilution is for TBW, it has been used extensively in pediatric practice (20–24). The specific objectives were to assess the predictive ability of each model according to evaluation criteria, to refit the models that perform best for TBW and FFM, and to cross-validate the refitted models with the use of our study population, which is made up of a large group of healthy children and a smaller group of HIV-infected children.
| SUBJECTS AND METHODS |
|---|
HIV-infected children (30 females and 24 males, aged 4–15 y) were recruited from 3 hospital-based, pediatric HIV/AIDS outpatient-treatment programs from 1994 to 1999. HIV infection was diagnosed, and the disease stage was classified according to the criteria of the Centers for Disease Control and Prevention (25).
We used a questionnaire to establish ethnicity; the criterion was consistent Asian, African American, Hispanic, or white background of both parents and all 4 grandparents. Subjects with ethnic backgrounds that did not fit these criteria were classified as "other." The Asian volunteers were of Chinese and Korean background, and the Hispanic subjects were mostly of Dominican origin. Pubertal stage was assessed according to the criteria of Tanner (26) by the pediatric endocrinologist or nurse in younger subjects and by self-assessment in subjects aged ≥11–12 y (27). Coefficients of 0.81–0.91 have been reported for concordance between physician assessment and self-assessment of pubertal status in subjects aged 9–18 y (27). In addition, the self-assessed pubertal stage of 94 subjects was in agreement with fasting values for gonadal steroids and gonadotropins in a subset of 100 of the participants in the current study (unpublished data). All subjects were without intercurrent illness at the time of the study visit.
The Institutional Review Board of St Luke's-Roosevelt Hospital Center approved the study protocol. Consent was obtained from each volunteer's parent or guardian; assent was obtained from each volunteer, as well, when appropriate.
Methods of body-composition measurement
All measurements were performed at the Body Composition Unit of St Luke's-Roosevelt Hospital Center 1 h after subjects had consumed a light meal and while each subject was wearing a hospital gown and foam slippers. Body weight was measured to the nearest 0.1 kg on a balance-beam scale (Weight Tronix, New York), and height was measured to the nearest 0.1 cm on a wall-mounted stadiometer (Holtain, Crosswell, United Kingdom).
TBW (in L) was measured by dilution of deuterium (D2O) given orally at a dose of 0.1 g/kg body wt (28). We collected 1–3 mL saliva before giving the deuterium. The concentration of tracer was measured in saliva collected 2 h after the administered dose with the use of an infrared system (Miran IFF Fixed Filter Laboratory Analyzer; Foxboro, Norwalk, CT) with a precision of ± 2.1% (29–32). The measured TBW was not corrected for nonaqueous exchange.
FFM (in kg) was measured by whole-body DXA scan [Lunar DPX (pediatric software version 3.8G); Lunar Co, Madison, WI (33)]. The scan mode was chosen according to the weight guidelines provided by the manufacturer. The CV for DXA percentage body fat (%BF) in adults in our laboratory is 3.3% (34).
Total body resistance and impedance were measured on each subject's right side with a single-frequency 50-kHz tetrapolar BIA device (model 101 A; RJL, Detroit) and Tracets MP 3000 electrodes containing electrode gel (Lecter Corp, Minnetonka, MN), as described previously (35). Resistance and impedance have a reproducibility of ± 0.31%.
Statistical analysis
The evaluation of models for TBW and FFM involved several steps. First, we evaluated the predictive ability of existing models for subjects in our database according to statistical criteria. Second, we selected the models that performed best on the evaluation criteria and refitted these models by using our data. Third, we estimated error rates for the refitted models by using cross-validation. Fourth, we evaluated the potential improvement of these models by adding new variables: age, weight, height, ethnicity, HIV infection status, sex, and Tanner stage. Fifth, we used cross-validation to estimate error rates for models that included additional variables.
The evaluation criteria included Pearson's correlation coefficients, SEE, percentage prediction errors, loss of precision, and Bland-Altman limits of agreement. SEE for 2 variables (x and y) was computed as
[Equation 1]
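As an illustration of how several of these criteria can be computed from paired predicted and criterion values, the following is a minimal sketch, not the authors' code. It assumes the textbook form of the SEE (the residual SD of a simple linear regression of criterion on predicted values, with n - 2 degrees of freedom), which may differ from Equation 1, and the usual Bland-Altman limits (mean error ± 1.96 SD of the errors). The function names and the sample values are hypothetical, and loss of precision is omitted because its formula is not given in this text.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def evaluate(predicted, criterion):
    """Return correlation, SEE, bias, mean % error, and 95% limits of agreement."""
    n = len(criterion)
    errors = [p - c for p, c in zip(predicted, criterion)]
    pct_errors = [100.0 * e / c for e, c in zip(errors, criterion)]
    r = pearson_r(predicted, criterion)
    # Assumed SEE: residual SD of criterion regressed on predicted (n - 2 df).
    my = mean(criterion)
    syy = sum((c - my) ** 2 for c in criterion)
    see = math.sqrt(max(0.0, 1.0 - r * r) * syy / (n - 2))
    bias = mean(errors)   # mean error (predicted minus criterion)
    sd = stdev(errors)    # sample SD of the errors
    return {
        "r": r,
        "see": see,
        "bias": bias,
        "mean_pct_error": mean(pct_errors),
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
    }


# Hypothetical TBW values in litres, for illustration only.
predicted_tbw = [20.1, 25.3, 30.2, 34.8, 40.5]
criterion_tbw = [20.0, 25.0, 30.0, 35.0, 40.0]
for name, value in evaluate(predicted_tbw, criterion_tbw).items():
    print(f"{name}: {value:.3f}")
```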
The independent variables from the best prediction equations were refitted to our data. For TBW, ordinary least squares was used to refit the equation; for FFM, nonlinear regression was used to refit the variable coefficients (14). New prediction equations were then generated by use of the original predictor variables with newly fitted model coefficients. Because the same data were used to generate the equations and to evaluate them, simple estimates of error rates based on a comparison of observed and predicted values are not valid: they underestimate the true error rate. Thus, we used cross-validation to estimate error rates (39). In this method, one observation at a time is removed from the data set, the model is fitted to the remaining observations, and the "removed" case is then predicted from that fit. This is also known as the "one left out" method. Cross-validation gives approximately unbiased estimates of the true error rate (40).
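The "one left out" procedure can be sketched as follows. This is an illustrative example only: it uses a single predictor fitted by ordinary least squares, whereas the refitted equations in this study use several predictors, and the data values and function names are hypothetical. It also computes the apparent (resubstitution) error so that the two estimates can be compared.

```python
import math
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope


def apparent_errors(x, y):
    """Errors when the model is fitted and evaluated on the same data."""
    intercept, slope = fit_line(x, y)
    return [(intercept + slope * a) - b for a, b in zip(x, y)]


def leave_one_out_errors(x, y):
    """Errors when each case is predicted from a fit that excludes it."""
    errors = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        errors.append((intercept + slope * x[i]) - y[i])
    return errors


def rmse(errors):
    """Root-mean-square error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical height^2/resistance values and TBW (L), for illustration only.
h2_over_r = [20.0, 25.0, 31.0, 38.0, 44.0, 52.0]
tbw = [13.1, 15.0, 18.4, 21.0, 24.2, 27.5]
print("apparent RMSE:", rmse(apparent_errors(h2_over_r, tbw)))
print("leave-one-out RMSE:", rmse(leave_one_out_errors(h2_over_r, tbw)))
```

For a least-squares fit, each leave-one-out error is at least as large in absolute value as the corresponding apparent error, which is why the apparent error rate is optimistic.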
To evaluate potential improvements in the prediction equations by adding new variables, we used both stepwise and best-subset methods, as implemented in the SAS linear regression software, version 8.2 (SAS Institute Inc, Cary, NC). We used Mallows' Cp criterion for selecting the best model (41).
To carry out this procedure for the model for predicting FFM, we used a 3-step process. First, we used the best-subset method to identify the best set of predictor variables for TBW. Second, we used the best-subset method to identify the best predictor variables for the hydration of FFM (TBW/FFM). Third, we used nonlinear least squares (as implemented in the SAS NLIN procedure; SAS Institute Inc) to determine the best coefficients for the ratio of the 2 expressions. We then used the Bonferroni method to interpret significance levels, because of the large number of comparisons.
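The Bonferroni adjustment mentioned above can be illustrated with a short sketch. The number of comparisons is not stated in this text, so the list of P values below is hypothetical (only 0.0366 appears in the article, for the HIV-infected subgroup), as is the function name.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold (alpha / m) and a significance flag per test."""
    m = len(p_values)          # number of comparisons
    threshold = alpha / m
    return threshold, [p < threshold for p in p_values]


# Hypothetical set of 5 subgroup P values, for illustration only.
p_values = [0.0366, 0.0004, 0.2, 0.03, 0.6]
threshold, flags = bonferroni(p_values)
for p, significant in zip(p_values, flags):
    print(f"P = {p}: {'significant' if significant else 'not significant'} "
          f"at per-test threshold {threshold:.4f}")
```

With even a handful of comparisons, a nominal P value of 0.0366 falls above the adjusted threshold, which matches how such a value is treated in a multiple-comparison setting.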
| RESULTS |
|---|
The models for FFM were more consistent, with relatively low SEEs for all models. However, the mean percentage error ranged from -7.0% to 10.7%, which indicates considerable bias in some equations.
The loss of precision, shown in the last column of Table 3, measures the increase in sample size required to compensate for the use of a predicted value rather than the criterion measure in an experimental or epidemiological study (36). These values ranged from 5.2% to 38.0% for TBW, which indicates that an increase in sample size would be modest for equation b of Kushner et al (9) but substantial for the equation of Fjeld et al (5). Most of the models for FFM had minimal loss of precision, ranging from 2.8% to 5.5%.
Selection of "best" models
We chose as the "best" models for predicting TBW and FFM those that maximized correlation while minimizing SEE, mean percentage error (in absolute value), the range between upper and lower limits of agreement for error and percentage error, and loss of precision. By these criteria, equation b of Kushner et al (9) was chosen as the best model for TBW. The equation of Goran et al (14) was chosen as the best model for FFM.
Refining the best models
Using the predictor variables in these equations, we estimated new model coefficients by using least-squares regression based on the study population shown in Table 1. We obtained the following equations:
[Equation 2]
[Equation 3]
The evaluation statistics for the refitted models for TBW and FFM, both for the overall study population and for important demographic subgroups, are given in Tables 4 and 5. For TBW, the mean error was ± 0.001 L, which indicates virtually no bias with the use of predicted rather than criterion values for the overall study population. The lower and upper limits of agreement were -4.063 and 4.065 L, which means that 95% of persons could be expected to have predicted and criterion values that agree to within 4.065 L. As a percentage of the criterion value, the overall 95% limits of agreement were ±16%. The overall minimum and maximum percentage errors for TBW were -0.293 and 0.254 for our data, whereas the SD of the percentage errors was 0.080 (data not shown).
For FFM, the mean error was 0.007 kg, which also indicates virtually no bias for predicted rather than criterion values for the overall study population. The lower and upper limits of agreement for FFM were -3.949 and 3.962 kg, which means that 95% of subjects could be expected to have predicted and criterion values that agree to within ±4.0 kg. As a percentage of the criterion value, the 95% limits of agreement were ±11%, which is substantially better than the limits for TBW. The overall minimum and maximum percentage errors for FFM were -0.299 and 0.224 for our data, whereas the SD of the percentage errors was 0.056 (data not shown).
The degree of bias for predicted FFM was negligible for most of the subgroups considered in the study. It was < 2% for all subgroups except those of children aged 4–8 y and of children infected with HIV; in those subgroups, the percentage error averaged 2.1% and 2.9%, respectively. The significance values in the "P value for error" column show that there was significant bias for several ethnic, age, and Tanner stage subgroups. (Because of the large number of comparisons, P = 0.0366 for the HIV-infected subgroup is not considered significant.)
For both TBW and FFM, the SEE values were relatively constant across subgroups, which indicates that the prediction models work consistently well for the different subgroups. However, there was an increased loss of precision in the age and Tanner stage subgroups. This was due to the smaller variation in criterion values for TBW and FFM within those subgroups (see the SDs in Table 1); because this variation in the criterion values was smaller, the relative loss of precision with the use of a predicted value was greater.
The differences between predicted and criterion values plotted against the mean of the predicted and criterion values are shown in Figures 1 and 2. These plots show whether there is any relation between the prediction errors and the criterion values. The presence of such a relation would mean that further model fitting is in order. However, the horizontal slope of the plot indicates there was little or no association between the 2 sets of values: as the means of the predicted and criterion values increased, the errors still had a mean of approximately 0. The rank correlations for these figures were 8.9% (P = 0.002) and 4.4% (P = 0.121), respectively, which confirms the lack of meaningful association between the prediction errors and the size of the criterion value.
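A check of this kind can be sketched as a Spearman rank correlation between the prediction errors and the means of the predicted and criterion values. The sketch below is illustrative only; the data are hypothetical, the function names are not from the article, and it computes the correlation alone (no P value).

```python
import math
from statistics import mean


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def error_trend(predicted, criterion):
    """Rank correlation between prediction error and the predicted/criterion mean."""
    errors = [p - c for p, c in zip(predicted, criterion)]
    means = [(p + c) / 2 for p, c in zip(predicted, criterion)]
    return spearman(errors, means)


# Hypothetical FFM values in kg, for illustration only.
predicted_ffm = [18.2, 24.9, 31.5, 37.6, 45.3, 52.1]
criterion_ffm = [18.0, 25.4, 31.0, 38.1, 45.0, 52.6]
print("rank correlation of error with size:", error_trend(predicted_ffm, criterion_ffm))
```

A value near 0 suggests that the errors do not grow or shrink systematically with body size; a value far from 0 would suggest that the model needs further fitting.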
[Equation 4]
As an example, to compute the estimated TBW for an HIV-infected African American male who is 16 y old and classified as Tanner stage 5, one would add (0.632 + 0.676 + 0.637 + 0.641 = 2.586) to the result of the first 4 terms to get the predicted TBW (in L).
The corresponding model for FFM is the equation
[Equation 5]
[Equation 6]
[Equation 7]
By conventional statistical standards, these models show little improvement over their much simpler counterparts discussed earlier. As before, the r² values were 0.959 and 0.997, respectively. Similarly, there was little effect on loss of precision, limits of agreement, or extreme percentage errors. However, the full models had significant effects on bias, eliminating systematic bias in all subgroups. This was shown by the reduction in mean percentage error and by the nonsignificant P values for testing whether the prediction errors differed from zero.
| DISCUSSION |
|---|
BIA-based predictive equations have 2 potential applications: in the evaluation of specific persons or as outcome variables in population studies. The first application is relevant for individual clinical evaluation, but is also used in health clubs and diet centers. In the second application, predicted values are collected for epidemiologic studies or clinical trials: for example, the National Health and Nutrition Examination Survey collects data on both impedance and resistance in persons aged ≥12 y (42). Predictive equations can be applied to these data to obtain body-composition measures.
The results of this study have implications for both applications. The performance measures of both the simple and full (with additional variables) refitted equations for the whole population and by subgroup will allow investigators to consider the strengths and limitations of BIA-based prediction of body composition.
Individual evaluation
The data presented in this study show wide 95% limits of agreement between predicted and criterion values for TBW and FFM that are fairly consistent across subgroups. Subgroup bias was effectively eliminated when the full models were used, but the limits of agreement did not change.
The limits of agreement observed in this study are similar to those previously reported for BIA in both pediatric and adult subjects (5, 10, 12, 13, 43, 44). The relevant question is whether these limits of agreement are acceptable in clinical practice or research. Body composition is a valuable measure for the investigation of growth in children and adolescents. The limits of agreement of ± 11% for BIA-predicted FFM indicate that only variation that is > 11% would be clearly identified by BIA, because anything less could be due to prediction error alone. The limits of agreement (± 11%) are equivalent to the SD for FFM in those aged 7–8 y, for example, which means that only differences > 1 SD would be clearly identified (45–47).
Further comparison of repeated individual measures with criterion values may suggest that a person's measurements have a consistent relation to the criterion values. However, the observed wide limits of agreement, even with refitted full models, support the suggestions of previous investigators that BIA-based predictions are not appropriate for the evaluation of body composition in individual persons (5, 9, 10, 12, 15).
Population studies
In population studies, the limits of agreement are less important than the loss of precision and the degree of bias due to prediction error. The most relevant performance measure for research applications is the loss of precision, which shows the increase in sample size that is necessary to offset the error when predicted rather than criterion measures are used. The more restricted the proposed study group is with regard to age or Tanner stage, the less the between-subject variability and the greater the loss of precision. For example, the sample size must be increased by 14.4% for TBW (7.4% for FFM) for a study of children in Tanner stage 3, but by only 4.6% (2.3% for FFM) for a study of the general population of those aged 4–18 y.
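To make the sample-size implication concrete, the following sketch applies the loss-of-precision percentages quoted above to a planned study size. The planned size of 100 subjects and the function name are hypothetical; the result is rounded up to a whole subject.

```python
import math


def inflated_sample_size(n_with_criterion, loss_of_precision_pct):
    """Subjects needed with predicted values, given the size needed with criterion values."""
    required = n_with_criterion * (1 + loss_of_precision_pct / 100.0)
    # Round first to avoid floating-point noise pushing an exact value upward.
    return math.ceil(round(required, 9))


planned_n = 100  # hypothetical study size if the criterion method were used
scenarios = [
    ("TBW, Tanner stage 3", 14.4),
    ("FFM, Tanner stage 3", 7.4),
    ("TBW, general population", 4.6),
    ("FFM, general population", 2.3),
]
for label, loss in scenarios:
    print(f"{label}: {inflated_sample_size(planned_n, loss)} subjects")
```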
The bias (mean percentage error) in predicted TBW and FFM from the simple models was negligible overall, was present for specific subgroups, and was effectively eliminated with the full models. The negligible overall bias indicates that simple-model BIA predictions can be used for studies involving the general population of those aged 4–18 y. However, the bias for some subgroups, notably HIV-infected children, means that the full models are more appropriate for use in studies involving subgroups.
The measure of systematic bias (mean percentage error) will also help in the evaluation of the prediction equations for a specific research project. The bias was low for most subgroups (< 2%), but the degree and direction of each mean percentage error are important if the proposed study compares predicted TBW or FFM between subgroups for which the biases differ. Any differences observed between the subgroups may be the result of bias in measurement rather than actual differences. Therefore, even a small bias may increase the likelihood of type I statistical errors, particularly in studies of large populations. The use of the full models eliminated the problem of bias for all subgroups. The use of the full models requires the uniform collection of more variables, but the investigator's choice of the simple or full model should depend on the degree of bias anticipated from the specific characteristics of the study population or the number of subjects.
Previous reports suggested that population-specific prediction equations might be necessary (10–19). The information in this report will allow investigators to determine whether the refitted simple models or the full models are applicable to their study population or whether, in fact, population-specific equations are required for their specific patient group (48).
Limitations
A limitation of this report is the cross-sectional design, which meant that we could not compare the performance of serial BIA predictions with serial criterion values in the same children. Another limitation is that all BIA measurements were obtained by 4 cross-validated technicians in the same laboratory and under consistent conditions. As noted by other investigators, the validity of generalizing these laboratory-based findings to field conditions is undetermined (12, 15). A third limitation is that only one group with a disease (HIV infection) was part of the study population, so that an assessment of the loss of precision or of bias of BIA predictions in children with other, more common pediatric diseases is not available. Finally, as noted earlier, DXA is not the ideal criterion for FFM measurement in children, although its ease of performance in the laboratory allowed a broad range of persons to participate in this project.
Conclusions
This comprehensive study of BIA models for predicting TBW and FFM in a large pediatric population showed 1) an extremely high correlation between predicted and observed criterion values for all models evaluated and 2) a lack of bias and an improved loss of precision for predicted compared with observed values using the refitted models for TBW and FFM for the study population, but 3) persistence of difference between predicted and criterion values even with refitted models. Given the wide CIs in the error of estimating individual results, the clinical applicability of a single BIA measurement is uncertain, but BIA measurement is suitable for populations.
This report provides unique information about the correlation and limits of agreement between BIA and criterion values, about the size of a study population that is needed to compensate for error when a prediction model is used, and about the prediction bias expected for specific ethnic, age, Tanner stage, sex, and disease (HIV infection) subgroups. The equations were largely derived and cross-validated in healthy children, so the precision and accuracy may be lower in the presence of acute or chronic disease. This information is of particular value to pediatricians who must determine whether single-frequency BIA to predict TBW or FFM is applicable to, or informative for, their clinical assessment or investigative needs when more sophisticated or precise body-composition techniques are not available.