
Unintended consequences of machine learning in medicine? [version 1; referees: 2 approved]
Laura McDonald1, Sreeram V. Ramagopalan1 (https://orcid.org/0000-0002-4766-5160), Andrew P. Cox2, Mustafa Oguz2
This article is included in the Machine learning: life sciences collection.
Abstract
Machine learning (ML) has the potential to significantly aid medical practice. However, a recent article highlighted some negative consequences that may arise from using ML decision support in medicine. We argue here that whilst the concerns raised by the authors may be appropriate, they are not specific to ML, and thus the article may lead to an adverse perception of this technique in particular. Whilst ML, like any methodology, is not without its limitations, a balanced view is needed in order not to hamper its use in potentially enabling better patient care.
Corresponding author: Laura McDonald
How to cite: McDonald L, Ramagopalan SV, Cox AP and Oguz M. Unintended consequences of machine learning in medicine? [version 1; referees: 2 approved]. F1000Research 2017, 6:1707 (doi: 10.12688/f1000research.12693.1)
Copyright:  © 2017 McDonald L et al. This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Competing interests: LM and SR are employees of Bristol-Myers Squibb Company. AC and MO are employees of Evidera Inc.
First published: 19 Sep 2017, 6:1707 (doi: 10.12688/f1000research.12693.1)
There is significant interest in the use of machine learning (ML) in medicine. ML techniques can ‘learn’ from the vast amount of healthcare data currently available, in order to assist clinical decision making. However, a recent article1 highlighted a number of consequences that may occur with increased ML use in healthcare, including physician deskilling, and that the approach is a ‘black box’ and unable to use contextual information during analysis.

Whilst we agree that Cabitza et al.'s concerns are justified1, we believe that a more balanced discussion could have been provided with regard to ML-based decision support systems (ML-DSS). As it stands, the impression is given that ML itself is flawed, rather than the way in which it is applied. The concerns raised are generally applicable to many analytical approaches, and reflect poor study design and/or a lack of analytical rigour rather than the particular technique being used.

The authors cite two examples to claim that ML-DSS could potentially reduce physician diagnostic accuracy. The mammogram example2 shows a reduction in sensitivity for 6 of the 50 radiologists, specifically the most discriminating ones. However, the mammogram ML-DSS referred to is old2, and it is not clear how the underlying model was trained and evaluated. The model may perform well for some types of cancer but, as a result of the training data, not as well for others. Indeed, updates to such systems have been shown to increase detection sensitivity3. ML models can be refined by providing more data, and results need to be critically appraised in this context. Additionally, no mention is made of the possible benefits of ML-DSS for less experienced staff. In the mammogram example, an improvement in sensitivity for easier-to-detect cancers was seen for 44 out of 50 radiologists. Overall diagnostic accuracy also increased when ML-DSS was used in the electrocardiogram study4. The observation that experienced readers can lose accuracy when using ML-DSS is valid, but it reflects the training needed rather than an outcome specific to ML-DSS. A knowledgeable doctor may have no need for an ML-DSS, but the tool could greatly assist less experienced staff.

Cabitza et al. also argue that the confounding caused by asthma in the outcome of patients with pneumonia would not have been observed in a neural network model. There are, however, methods to obtain feature importance and the direction of the relationship between predictor variables and the outcome in neural networks5. Further, some ML approaches, such as random forests, are more transparent than others, and ML can readily be coupled with clinical expertise to develop risk models that have benefits over traditional statistical modelling6.
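To illustrate that neural networks need not remain a complete 'black box', the following is a minimal sketch of permutation-based feature importance, a model-agnostic analogue of the randomization approach of reference 5. The synthetic dataset and the small network below are illustrative assumptions for demonstration only, not the model or data from any study discussed here.

```python
# Illustrative sketch: estimating which predictors a neural network relies on.
# Dataset and model are synthetic assumptions, not from the cited studies.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.inspection import permutation_importance

# Synthetic 'clinical' data: the first 4 predictors are informative,
# the last 4 are pure noise (shuffle=False keeps that column order).
X, y = make_classification(n_samples=500, n_features=8, n_informative=4,
                           n_redundant=0, shuffle=False, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X, y)

# Permutation importance: shuffle one predictor at a time and measure the
# resulting drop in accuracy; informative predictors should score higher.
result = permutation_importance(model, X, y, n_repeats=20, random_state=0)
for i, imp in enumerate(result.importances_mean):
    print(f"feature {i}: mean importance {imp:.3f}")
```

In a clinical setting, a plot of these importances alongside the sign of each predictor's partial effect gives clinicians a view of what drives a model's predictions, which speaks directly to the 'black box' concern.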

The issues highlighted by Cabitza et al. concern the studies themselves rather than an intrinsic flaw in ML methodology. To fully leverage ML, or any other approach, users must have a good understanding of its caveats. In summary, we agree that ML-based approaches are not without their limitations, but the growing application of ML in healthcare has the potential to significantly aid physicians, especially in increasingly resource-constrained environments. Informed, appropriate use of ML-DSS could, therefore, enable better patient care.

Competing interests

LM and SR are employees of Bristol-Myers Squibb Company. AC and MO are employees of Evidera Inc.

Grant information
The author(s) declared that no grants were involved in supporting this work.

References
1.  Cabitza F, Rasoini R, Gensini GF: Unintended Consequences of Machine Learning in Medicine. JAMA. 2017; 318(6): 517–518.
2.  Povyakalo AA, Alberdi E, Strigini L, et al.: How to discriminate between computer-aided and computer-hindered decisions: a case study in mammography. Med Decis Making. 2013; 33(1): 98–107.
3.  Kim SJ, Moon WK, Kim SY, et al.: Comparison of two software versions of a commercially available computer-aided detection (CAD) system for detecting breast cancer. Acta Radiol. 2010; 51(5): 482–490.
4.  Tsai TL, Fridsma DB, Gatti G: Computer decision support as a source of interpretation error: the case of electrocardiograms. J Am Med Inform Assoc. 2003; 10(5): 478–483.
5.  Olden J, Jackson DA: Illuminating the "black box": a randomization approach for understanding variable contributions in artificial neural networks. Ecol Model. 2002; 154(1–2): 135–150.
6.  Ayer T, Chhatwal J, Alagoz O, et al.: Informatics in radiology: comparison of logistic regression and artificial neural network models in breast cancer risk estimation. Radiographics. 2010; 30(1): 13–22.
Open Peer Review

Referee Report 23 Nov 2017
Hugo Schnack, Department of Psychiatry, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, Netherlands
Zimbo Boudewijns, Department of Psychiatry, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, Netherlands
 Approved
Machine learning (ML) methods are currently being applied in a wide range of fields. Theoretically, the ability to extract meaningful relations from large datasets holds great promise for health care and could potentially offer new, unexpected insights into disease …

Referee Report 30 Oct 2017
Arturo Gonzalez-Izquierdo, Institute of Health Informatics, University College London, London, UK
Maria Pikoula, Institute of Health Informatics, University College London, London, UK
Spiros Denaxas, Institute of Health Informatics, University College London, London, UK
 Approved
The publication of this letter is both important and timely given the increased interest that statistical learning approaches applied to healthcare data are receiving.
