
3 Analysis 1: Comparison of Methods

Here we extend the idea of randomizing the scan order to the subsystem level. For example, in a system of three subsystems, instead of cycling through the subsystems in a fixed order (S1 → S2 → S3 → S1), we use a random order such as S1 → S3 → S1 → S2, and so on. This is in addition to implementing random scan at the variable level within a subsystem. Four methods were compared: (1) random scan at the subsystem level and fixed scan at the variable level (RF); (2) random scan at the subsystem level and random scan at the variable level (RR); (3) random scan at the subsystem level with the restriction that each subsystem must appear once and only once in a cycle (e.g., S1 → S3 → S2 for one cycle at the subsystem level), and fixed scan at the variable level (RF*); and (4) random scan at the subsystem level in which each subsystem must appear once and only once in a cycle, and random scan at the variable level (RR*). Because the fixed scan algorithm applied at both the subsystem and the variable level often failed to converge, or converged to an apparently highly biased solution, we did not include it in our comparison study.
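The four scan schemes can be illustrated with a minimal Python sketch that only generates the order of updates and does not perform any sampling. All function names, the `METHODS` table, and the variable names in the usage example are hypothetical. The sketch also assumes that an unrestricted random scan draws each visit independently and uniformly (so repeats are possible), and that random scan at the variable level works the same way within a subsystem; the text does not specify these details.

```python
import random

# Hypothetical lookup for the four compared methods:
# (restrict each cycle to a permutation of subsystems?, random scan over variables?)
METHODS = {
    "RF": (False, False),
    "RR": (False, True),
    "RF*": (True, False),
    "RR*": (True, True),
}


def subsystem_order(n_subsystems, n_cycles, permute, rng):
    """Return the sequence of subsystem indices to visit.

    permute=False: every visit is an independent uniform draw, so a
    subsystem may repeat or be skipped within a cycle (RF, RR).
    permute=True: each cycle is a random permutation, so every subsystem
    appears once and only once per cycle (RF*, RR*).
    """
    order = []
    for _ in range(n_cycles):
        if permute:
            order.extend(rng.sample(range(n_subsystems), n_subsystems))
        else:
            order.extend(rng.randrange(n_subsystems) for _ in range(n_subsystems))
    return order


def variable_order(variables, random_scan, rng):
    """Return the order in which variables of one subsystem are updated."""
    if random_scan:
        # Assumption: uniform draws with replacement, one per variable.
        return [rng.choice(variables) for _ in variables]
    return list(variables)  # fixed scan: always the listed order


def scan_schedule(method, subsystems, n_cycles, seed=0):
    """Return a list of (subsystem name, variable) update steps."""
    permute, random_vars = METHODS[method]
    rng = random.Random(seed)
    names = list(subsystems)
    schedule = []
    for idx in subsystem_order(len(names), n_cycles, permute, rng):
        name = names[idx]
        for var in variable_order(subsystems[name], random_vars, rng):
            schedule.append((name, var))
    return schedule


if __name__ == "__main__":
    # Hypothetical toy system; the variable names are placeholders.
    toy = {"S1": ["a", "b"], "S2": ["c"], "S3": ["d", "e"]}
    for method in METHODS:
        print(method, scan_schedule(method, toy, n_cycles=1))
```

Separating the subsystem-level choice from the variable-level choice keeps the four methods as the four combinations of two independent switches, which mirrors how they are defined in the text.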

We used a data set from a multi-center epidemiological study, the Health, Aging and Body Composition Study (Health ABC), to anchor the comparisons. Health ABC is a cohort study focused on understanding the relationship between body composition, weight-related health conditions, and incident functional limitation. This rich data set now contains more than 10 years of follow-up since the study's inception in 1997/98, when baseline data were collected. The study sample consisted of 3,075 well-functioning black and white men and women aged 68 to 80 years, recruited from two field centers in the eastern part of the U.S. For the purpose of the current study, selected variables were used to represent four subsystems. Fig. 3 shows the four subsystems and the set of variables common to them. The subsystems are Disease status (S1), Mental status (S2), Self-efficacy in function (S3), and Physical performance (S4); the common set of variables includes demographic information (gender, age, and race) and an anthropometric measure (BMI). The L-divergence and the L1 discrepancy (e.g., see [7]) were used to measure error, which is defined as a function of the difference between the derived conditional distribution and the actual conditional distribution within each of the subsystems S1 to S4. Errors were normalized by dividing the total cumulative error by the number of cells used in each subsystem, so that values could be fairly compared across subsystems. For each method, a total of 300 million Gibbs samples were used to estimate the cell probabilities.
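As a rough illustration of the normalization step, the following Python sketch computes an L1 discrepancy between two sets of cell probabilities and divides it by the number of cells. The function name is hypothetical, and the sum of absolute differences is an assumed form of the L1 discrepancy; the exact definitions of both error measures are given in [7] and are not reproduced here.

```python
def normalized_l1(derived, actual):
    """Sum of absolute cell-wise differences, averaged over the cells.

    derived, actual: sequences of cell probabilities for one subsystem,
    listed in the same cell order. The form of the measure is an
    assumption made for illustration only.
    """
    if len(derived) != len(actual):
        raise ValueError("both distributions must cover the same cells")
    total = sum(abs(d - a) for d, a in zip(derived, actual))
    return total / len(derived)


if __name__ == "__main__":
    # Made-up two-cell example, not data from the study.
    print(normalized_l1([0.5, 0.5], [0.25, 0.75]))
```

Dividing by the number of cells keeps a subsystem with many cells from appearing worse than a small one simply because it accumulates more terms.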

Fig. 3. Four subsystems with a set of common variables in HABC

3.1 Results for Analysis 1

Table 1 shows the errors (smaller is better) for the four methods. In general, the methods that use random scan at the variable level are superior to their respective fixed scan counterparts. For example, RR, the method that uses random scan at both the subsystem and the variable level, shows on average a 38% reduction in L-divergence and a 20% reduction in L1 discrepancy compared with RF. We were surprised by the improvement of RR* over RR, which appears to be consistent across the subsystems, even though its magnitude is not as large as the improvement seen when comparing RR with RF, or RR* with RF*. Simply enforcing the requirement that each cycle walks through all subsystems appears to help overall performance, if not the rate of convergence.

Table 1. Normalized error rates (×10^3) of 4 pseudo-Gibbs-sampler methods
