
The Usefulness of Blast Flags on the Sysmex XE-5000 Is Questionable

Heidi Eilertsen MSc, Nina K. Vøllestad PhD, Tor-Arne Hagve MD, PhD
DOI: http://dx.doi.org/10.1309/AJCPDUZVRN5VY9WZ. Pages 633-640. First published online: 1 May 2013


Hematology analyzers generate suspect flags that prompt microscopic review to confirm the presence of pathologic cells. This study investigated the reliability of the blast flag in a side-by-side evaluation of 3 Sysmex XE-5000 instruments (Sysmex, Kobe, Japan). The repeatability of the Q values reported by each instrument for 10 replicates of the same blood samples was low (intraclass correlation coefficient [ICC] values, 0.62–0.74). The reproducibility of the Q values obtained by analyzing 408 samples on all 3 instruments was reasonable (ICC value, 0.85). In addition, a systematic difference was observed among the instruments in the level of reported Q values. With the commonly used cutoff of 100, the observed reproducibility of the blast flagging among the instruments was evaluated as poor (κ = 0.73). Based on the observed low performance, we question the usefulness of the Q value as a predictor of blasts and whether a blast flag reported by the XE-5000 is sufficient as a criterion for performing a microscopic review.

Key Words:
  • Blast flag
  • Sysmex XE-5000
  • Reliability
  • Q values

Modern automated hematology instruments provide reliable WBC differential counts for results that are within the reference limits and for those with only quantitative abnormalities and no pathologic cells. However, for samples with abnormal cells, such as blasts, a manual differential count is generally indicated.1 The automated counters use flags to indicate the presence of pathologic cells and to notify the user that the differential may not be correct.2 Factors that influence the generation of the flags, as well as the review criteria of the laboratory, are thus important for the workload and review rate. The usefulness of the flags depends on their diagnostic sensitivity and specificity. Cutoffs must therefore be carefully optimized to balance the risk of missing pathologic cells against laboratory efficiency.3 The factors and algorithms providing flags depend on the underlying technology of the instrument and thus vary from one manufacturer to another. For some instruments, the flags suggest an increased probability of pathologic cells, based on an increased number of cells in defined areas of the scattergrams. For the Sysmex line of instruments (Sysmex, Kobe, Japan), the probability of the presence of blasts is designated by a Q value. The Q value provides information on the degree of positivity or negativity of the flag on a scale of 0 to 300, with increments of 10 arbitrary units. At a given threshold, the blast flag is triggered and reported by the instrument. The factory threshold is preset to an arbitrary value of 100. The user may, however, adjust the threshold to individual clinical needs, with the danger of missing pathologic cells or increasing the number of manual counts, resulting in increased turnaround times and costs.4,5 Because of an increasing workload and the need to decrease turnaround time, the trend is toward further automation.
Consequently, laboratories often use hematology systems that integrate numerous cell counters on rack-based track systems, and specimens are analyzed randomly on one of the cell counters. Samples flagged by the analyzers are submitted for microscopic review according to the laboratory's slide review criteria.
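The Q-value thresholding described above can be sketched in a few lines. The function name `blast_flag` is ours, for illustration only; this is a minimal sketch of the reporting rule, not instrument firmware.

```python
def blast_flag(q_value: int, threshold: int = 100) -> bool:
    """Report a blast flag when the Q value reaches the threshold.

    Q values run from 0 to 300 in increments of 10 arbitrary units;
    the factory-set threshold is 100, but users may adjust it.
    """
    if not (0 <= q_value <= 300) or q_value % 10 != 0:
        raise ValueError("Q values are reported in steps of 10 between 0 and 300")
    return q_value >= threshold

print(blast_flag(100))                 # True: factory threshold reached
print(blast_flag(90))                  # False: just below the threshold
print(blast_flag(100, threshold=300))  # False: only Q = 300 flags at this cutoff
```

Raising the threshold suppresses flags for intermediate Q values, trading sensitivity for specificity.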

The aim of this work was to study the reliability of the instrument-generated flagging information about the presence of myeloblasts in a side-by-side evaluation comparing 3 Sysmex XE-5000 instruments. The inter- and intrainstrument agreement of the Q value, the blast flag reports, and the accuracy of the flag compared with visual microscopy were examined to investigate the usefulness of blast flag reporting.

Materials and Methods

Sample Selection and Study Design

At Akershus University Hospital, Oslo, Norway, routine hematologic samples are analyzed randomly on 1 of 3 Sysmex XE-5000 instruments (hereafter, XE1, XE2, and XE3) integrated onto a track-based automation system. The instruments were purchased at the same time, calibrated and harmonized by the manufacturer, and used with identical software packages, including firmware upgrades. During a period of 5 months, 408 routinely obtained samples were selected and reanalyzed on all 3 instruments. For the purpose of this study, all samples were kept anonymous. The samples were selected based on the suspect flags initially reported by the Sysmex XE-5000. The triggering thresholds for the flags were factory settings, and only samples reported with 1 or more of the 4 suspect flags (blasts, immature granulocytes, atypical lymphocytes, and abnormal lymphocytes/lymphoblasts) were included. The Q values for blasts and the blast flag reports from all 3 instruments were used for the reproducibility studies. Diagnostic performance of the blast flag was investigated using the flag reports, with the results of the blood smear examination as the reference. To study the intrainstrument repeatability of the blast flag, 4 positive samples were analyzed 10 times on each of the 3 instruments. The samples were randomly selected from samples with blast flags.

Study Samples and Analytic Conditions

Venous blood samples were collected in Vacuette tubes (Greiner Bio-One, Frickenhausen, Germany) containing dipotassium ethylenediaminetetraacetic acid. The specimens were transported to the laboratory by a pneumatic tube transport system and processed on the Sysmex XE-5000 instruments using software version 00–04 in "closed mode" within 4 hours after collection. The instruments were continuously involved in the routine workload during the study period, and their performance was monitored with an internal quality control system, including control materials from the instrument manufacturer, and several external quality surveillance programs.

Sysmex XE-5000

The Sysmex XE-5000 produces scattergrams to provide a 5-part differential leukocyte count based on an optical principle, including forward and side scatter, side fluorescence, and electric impedance. The software also allows an extended differential count, including an immature granulocyte count (promyelocytes, myelocytes, and metamyelocytes). In addition, the instrument reports WBC-specific flags related to the possible presence of pathologic cells such as blasts, immature granulocytes, and variant lymphocytes. Normal leukocytes occupy distinct locations in the scattergrams, and pathologic cells appear in locations different from those of normal cells. Detection of cells in the areas of pathologic cells, combined with the use of various algorithms, generates flags.6 The blast flag indicates the possible presence of myeloblasts in the sample. The factory setting threshold for flagging blast cells was used. Hence, a blast flag was present on the sample report for Q values at or above 100 and absent for Q values below 100.7

Manual Differential Leukocyte Count

For each sample, 2 blood smears were prepared and stained with May-Grünwald-Giemsa using the Sysmex SP-1000. A total of 200 cells were counted by 2 highly trained technicians, each counting 100 cells.8 The average blast percentage was calculated for each sample based on the 2 manual differential counts. The presence of a single blast (0.5%) qualified as a true-positive smear finding, a criterion supported by the International Consensus Group for Hematology Review.9
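The 200-cell convention above works out as follows; the helper name is ours, for illustration only.

```python
def average_blast_percent(blasts_tech1: int, blasts_tech2: int,
                          cells_each: int = 100) -> float:
    """Average blast percentage from two technicians' counts.

    Each technician classifies `cells_each` cells; the two percentages
    are averaged, which equals the pooled percentage over all cells.
    """
    return 100.0 * (blasts_tech1 + blasts_tech2) / (2 * cells_each)

# A single blast among the 200 cells meets the 0.5% true-positive criterion.
print(average_blast_percent(1, 0))  # 0.5
```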

Statistical Analysis

Statistical analyses were performed using the SPSS software version 16 (SPSS, Chicago, IL) for Windows (Microsoft, Redmond, WA) and an online κ calculator.10 A 2-tailed P value of less than .05 was considered statistically significant.

To describe the data, the Q values were summarized as median and interquartile range (25th–75th percentile) before the analysis of concordance. The Friedman test was used to detect differences in the Q values across the 3 instruments, and the Cochran Q test was used to assess differences in reporting the blast flag among the instruments.11 The intraclass correlation coefficient (ICC[2,1], 2-way random effects, single measures) was calculated to quantify the intra- and interinstrument reliability for Q values. κ values were used to determine intra- and interinstrument reliability for the blast flag.12 The diagnostic performance of the blast flag was tested by a receiver operating characteristic (ROC) curve for all 3 instruments, and the McNemar test was used to compare the sensitivity and specificity of the blast flag when the threshold was 100 vs 300.11
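As a sketch of the reliability statistic used here, ICC(2,1) can be computed from a complete samples × instruments matrix of Q values. This illustrative code follows the standard two-way random effects, single measures formula; it is an assumption-laden sketch, not the SPSS implementation the authors used.

```python
def icc_2_1(data: list[list[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measures.

    `data` is an n x k matrix: n subjects (samples) measured by
    k raters (instruments).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(data[i][j] for i in range(n)) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ms_rows = ss_rows / (n - 1)                    # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                    # between-raters mean square
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))  # residual
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Identical Q values across instruments give perfect reliability:
print(icc_2_1([[10, 10, 10], [20, 20, 20], [30, 30, 30]]))  # 1.0
```

Systematic offsets between instruments lower ICC(2,1) because it measures absolute agreement, not just consistency.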


Results

The leukocyte counts ranged from leukopenia to severe leukocytosis (500–207,000/μL [0.5–207 × 10⁹/L]), with a median of 9,900/μL (9.9 × 10⁹/L) and a mean of 20,300/μL (20.3 × 10⁹/L). The manual differential count showed 0.5% or more blasts in 48 of the 408 samples.

Interinstrumental Variability of Q Values Among the 3 Instruments

Figure 1 shows the distribution of the Q values reported by the 3 instruments. The interquartile ranges of the Q values were 0 to 300 for XE1, 0 to 280 for XE2, and 0 to 250 for XE3; the frequencies of the Q values 0 and 300 differed among the 3 instruments. XE1 tended to give higher Q values than XE2 and XE3, and XE2 provided higher Q values than XE3. The median Q values reported by XE1, XE2, and XE3 were 70, 40, and 30, respectively, a significant difference (P < .001, Friedman test). The ICC for the Q values reported by the instruments was 0.85. Image 1 shows screen shots illustrating the discrepant Q values for the 3 instruments.

Interinstrument Variability of the Blast Flag Among the 3 Instruments

With the factory setting threshold for flagging, XE1, XE2, and XE3 reported the blast flag for 191, 164, and 146 samples, respectively. Correspondingly, no flag was reported for 217, 244, and 262 samples. The differences in blast flag reporting among the 3 instruments were statistically significant (P < .001, Cochran Q test). The κ value for the blast flag was 0.73. The κ values for agreement between pairs of instruments are shown in Table 1. Of the 408 samples tested, 1 or more instruments flagged for blasts in 208 samples. The instruments showed full agreement in 128 samples.
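Pairwise flag agreement of the kind reported in Table 1 is conventionally summarized with Cohen's κ. The following is an illustrative sketch for binary flag reports, not the online calculator the authors used.

```python
def cohen_kappa(flags_a: list[bool], flags_b: list[bool]) -> float:
    """Cohen's kappa for two instruments' binary blast flag reports."""
    n = len(flags_a)
    observed = sum(a == b for a, b in zip(flags_a, flags_b)) / n
    p_a, p_b = sum(flags_a) / n, sum(flags_b) / n
    # Agreement expected by chance from each instrument's flagging rate:
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)

# Identical reports give kappa = 1; chance-level agreement gives kappa = 0.
print(cohen_kappa([True, True, False, False], [True, True, False, False]))  # 1.0
```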

Figure 1

Frequency distribution of Q values for blasts generated by the Sysmex XE-5000 (Sysmex, Kobe, Japan). The same 408 samples were analyzed with 3 instruments: XE1 (A), XE2 (B), and XE3 (C).

Image 1

A representative example of discrepant Q values reported by the Sysmex XE-5000 (Sysmex, Kobe, Japan). Q flag screen shots from XE1 (A), XE2 (B), and XE3 (C) from sample 88. The names of the specific flags are given at the top of the bins. The Q values providing information on the degree of positivity or negativity of a flag are shown below the bins. The dotted lines represent the threshold triggering a flag report.

Intrainstrument Variability of Q Values

Mean, standard deviation, and range of Q values for the replicate measurements of 4 samples per instrument are shown in Table 2. The standard deviations varied from 22 to 91. The coefficients of variation varied from 10% to 109%, and lower coefficients of variation were associated with high mean values. ICCs for the Q value were 0.68, 0.62, and 0.74 for XE1, XE2, and XE3, respectively.

Intrainstrument Variability of the Blast Flag

The κ values calculated for the blast flag of the 4 replicate samples to assess intrainstrument repeatability were 0.32, 0.15, and 0.65 for XE1, XE2, and XE3, respectively.

ROC Curve Analysis

The ROC curves are shown in Figure 2. The areas under the curve (AUC) for the blast flag vs manual differential counts were 0.68 (95% confidence interval [CI], 0.59–0.76), 0.73 (95% CI, 0.65–0.80), and 0.71 (95% CI, 0.62–0.79) for XE1, XE2, and XE3, respectively (P < .001). No difference in AUC was found among the instruments (P = .112). The optimum flagging threshold was 255 for XE1, 105 for XE2, and 135 for XE3.
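The AUC equals the probability that a randomly chosen blast-positive sample receives a higher Q value than a randomly chosen blast-negative one. That equivalence gives a compact way to sketch the computation; this is illustrative only, as the reported AUCs were computed in SPSS.

```python
def auc_from_q_values(q_positive: list[int], q_negative: list[int]) -> float:
    """AUC via the Mann-Whitney interpretation: P(Q_pos > Q_neg), ties count half."""
    wins = 0.0
    for qp in q_positive:
        for qn in q_negative:
            if qp > qn:
                wins += 1.0
            elif qp == qn:
                wins += 0.5
    return wins / (len(q_positive) * len(q_negative))

# Perfect separation of Q values yields AUC = 1; complete overlap yields 0.5.
print(auc_from_q_values([200, 300], [0, 100]))  # 1.0
```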

Table 1

Table 2
Figure 2

Receiver operating characteristic curves for the overall performance of the blast flag on the Sysmex XE-5000 (Sysmex, Kobe, Japan) to predict blasts in blood samples. The area under the curve for the blast flag was 0.68 (95% confidence interval [CI], 0.59–0.76) for XE1, 0.73 (95% CI, 0.65–0.80) for XE2, and 0.71 (95% CI, 0.62–0.79) for XE3.

Sensitivity, Specificity, and Predictive Values

A manual review of the 208 samples flagged by at least 1 of the instruments showed the presence of blasts in 37 samples. Eleven samples with 0.5% or more blasts were reported negative (flag absent) by the instruments. Nine of the false-negative samples were from patients with leukopenia. Depending on the flagging threshold (100 or 300), the sensitivity of the blast flag varied from 0.54 to 0.75 for the 3 instruments (Table 3). The highest sensitivity was obtained with the Q value cutoff at 100. Correspondingly, the specificity varied from 0.57 to 0.81, with the highest value at the Q value threshold of 300. The decrease in sensitivity and the increase in specificity when the Q value threshold was increased from 100 to 300 were statistically significant (P < .05 and P < .001, respectively).
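The threshold comparison above can be sketched from per-sample Q values and smear results. The toy data below are invented for illustration and are not the study's data.

```python
def sensitivity_specificity(q_values, has_blasts, threshold):
    """Sensitivity and specificity of the blast flag at a given Q-value threshold."""
    tp = sum(1 for q, b in zip(q_values, has_blasts) if b and q >= threshold)
    fn = sum(1 for q, b in zip(q_values, has_blasts) if b and q < threshold)
    fp = sum(1 for q, b in zip(q_values, has_blasts) if not b and q >= threshold)
    tn = sum(1 for q, b in zip(q_values, has_blasts) if not b and q < threshold)
    return tp / (tp + fn), tn / (tn + fp)

q = [300, 150, 100, 50, 0]                  # invented Q values
blasts = [True, True, False, False, False]  # invented smear results

# Raising the cutoff from 100 to 300 lowers sensitivity and raises specificity:
print(sensitivity_specificity(q, blasts, 100))
print(sensitivity_specificity(q, blasts, 300))
```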

Table 3

Table 4

Table 4 shows the predictive values of the blast flag calculated at the various thresholds. The positive predictive value for the blast flag was higher with the Q value threshold at 300 than at 100.
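Predictive values follow directly from the confusion-matrix counts; the sketch below uses invented counts, not Table 4's.

```python
def predictive_values(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Positive and negative predictive values from confusion-matrix counts."""
    ppv = tp / (tp + fp)  # fraction of flagged samples that truly contain blasts
    npv = tn / (tn + fn)  # fraction of unflagged samples that truly lack blasts
    return ppv, npv

# Invented counts: removing false positives (e.g., by raising the
# threshold) lifts the PPV even when true positives also drop.
print(predictive_values(30, 70, 10, 90))   # PPV 0.3, NPV 0.9
print(predictive_values(25, 25, 15, 135))  # PPV 0.5, NPV 0.9
```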


Discussion

The aim of this study was to evaluate the usefulness of the blast flag on an automated cell counter by investigating its analytic performance. When comparing blast flag reports on 3 different XE-5000 instruments, we found a 12% to 31% difference in the number of blast flag reports among the instruments, indicating that unequal numbers of microscopic smear reviews would be performed. This finding was unexpected and hard to explain because the same samples were analyzed at the same time on the 3 instruments to minimize the factors that normally influence comparative evaluations.13,14 In addition, the instruments were under a strict and continuous quality control program. The quality control programs, however, include only normal cell populations and do not evaluate the instruments' flagging performance.

Q values obtained by repeated measurements of the same blood samples on the same instrument were highly variable, as seen in the ICC values (0.62–0.74). These results indicate that the precision of the individual instrument is too low to provide reliable blast flag reports. The reasons for this lack of concordance are not certain but might be related to the robustness of the optical information used to produce the scattergrams and of the algorithms for detecting the presence of blasts. In contrast, the reproducibility of the Q values obtained by the 3 instruments was reasonable, with a higher ICC value (0.85) than for intrainstrument repeatability. In addition to this somewhat more acceptable reproducibility, a systematic difference was found in the level of reported Q values. With a common cutoff at 100, the observed reproducibility of the blast flag reports among the instruments is evaluated as poor, based on the κ value (0.73). It is well known that false-positive flagging causes unnecessary reviews and that false-negative flagging fails to detect blast cells. Consequently, it is important to improve the value of the flagging system by optimizing the analytic performance of the flag alerts.

Gossens et al3 showed that elevating the flagging threshold levels could reduce the false-positive rate of the overall flagging performance by 37%, with a corresponding 14% reduction in microscopic reviews. However, they kept the cutoff value for the blast flag at the factory setting (Q = 100) to avoid creating additional false-negative blast flags. Our data showed that adjusting the flagging threshold from 100 to 300 increased the specificity of the blast flag and thereby reduced the review rate by 12%. As a consequence, the number of false-negative samples increased by 13% to 19%, which is too much if the goal is to avoid missing cases of blasts. However, as noted in previous studies, the gain in productivity has to be balanced against the clinical need for a slide review.15,16

This study also shows that the diagnostic sensitivity of the blast flag report is low (0.54–0.75). In the ROC curve analysis, the AUC is low (0.68–0.73), as is the positive predictive value (0.18–0.23). Because of this observed low AUC, we question the usefulness of the Q value as a predictor of blasts. It is of current interest to investigate whether the AUC can be improved by combining the blast flag with other flags or parameters, as shown by Hoedemakers et al.17 The observed sensitivity of the blast flag is higher in other studies, varying from 0.78 to 0.91.18–20 Likewise, the specificity varied in these studies from 0.83 to 0.95. The observed discrepancy may at least in part be explained by different criteria for including samples. In our study, samples with 1 or more flags were included to reflect the composition of samples that need follow-up in the routine laboratory. Furthermore, most authors have used the factory setting as a common threshold. An additional difficulty in comparing data from various studies is the interinstrument variability in sample classification. It may thus be of importance to optimize the flagging threshold when studying the performance of flags, as shown by Sireci et al.4 In addition, introduction of software upgrades with new flagging algorithms for the presence of blasts would have a direct impact on the analytic performance.21

Of the 11 false-negative samples, 9 were from patients with leukopenia. Previous studies have shown poor flagging sensitivity for blasts in samples with low leukocyte counts.19,22 Because sensitivity and specificity relate to discriminative ability, the definition of a true-positive smear has to be taken into consideration when evaluating the accuracy of the blast flag. In this study, the presence of a single blast (0.5%) was considered a true-positive smear. Moreover, examination of blood smears cannot be assumed to classify cells with unerring accuracy according to the presence or absence of blasts. Manual cell classification is subjective, with low inter- and intraobserver reproducibility, and it is well known that the manual differential count has limited value for low-frequency cell populations.23,24 In practice, it is difficult to determine a true-positive smear with absolute certainty, and referring to an imperfect standard introduces bias into the measures of sensitivity and specificity.25 A more accurate alternative to manual differential counting is the use of monoclonal antibodies and flow cytometry to identify abnormal cells such as blasts.8

To the best of our knowledge, this is the first report of the repeatability and reproducibility of blast flags in samples reporting blasts on initial testing with the Sysmex XE-5000 instrument. Our study indicates that the analytic quality, in terms of both inter- and intrainstrument variation, is questionable. Likewise, the clinical usefulness, as evaluated by sensitivity, specificity, and predictive values, indicates that under the present conditions the flag is not a good criterion for microscopic smear review.

In conclusion, it may be questioned whether a blast flag reported by the XE-5000 is sufficient as a criterion for performing a microscopic review. Its performance is inadequate because of the low reproducibility of repeated measurements on the same sample in the same instrument. Furthermore, the lack of agreement among harmonized instruments in reporting blast flags is unacceptable. In addition, the ability to predict the presence of blasts in smears is low, with a high rate of false-positive results.


Upon completion of this activity you will be able to:

  • discuss how the flagging performance for blasts affects the number of manual film reviews.

  • define Q value and discuss the potential advantages of using an optimized cutoff for the blast flag.

  • define interinstrumental variability and discuss how it affects the usefulness of the blast flag reporting.

The ASCP is accredited by the Accreditation Council for Continuing Medical Education to provide continuing medical education for physicians. The ASCP designates this journal-based CME activity for a maximum of 1 AMA PRA Category 1 Credit™ per article. Physicians should claim only the credit commensurate with the extent of their participation in the activity. This activity qualifies as an American Board of Pathology Maintenance of Certification Part II Self-Assessment Module.

The authors of this article and the planning committee members and staff have no relevant financial relationships with commercial interests to disclose.

Questions appear on p 692. Exam is located at www.ascp.org/ajcpcme.

