ACVR: Comparison of AI to the Veterinary Radiologist : simplebooklet.com

Received: 20 May 2021
Revised: 8 December 2021
Accepted: 8 December 2021
DOI: 10.1111/vru.13062
Vet Radiol Ultrasound. 2022;1–7. © 2022 American College of Veterinary Radiology. wileyonlinelibrary.com/journal/vru

ORIGINAL INVESTIGATION

Comparison of artificial intelligence to the veterinary radiologist’s diagnosis of canine cardiogenic pulmonary edema

Eunbee Kim (1), Anthony J. Fischetti (1), Pratheev Sreetharan (2), Joel G. Weltman (3), Philip R. Fox (4)

(1) Department of Diagnostic Imaging, The Animal Medical Center, New York, New York, USA
(2) Vetology Innovations, LLC., San Diego, California, USA
(3) Department of Emergency and Critical Care, The Animal Medical Center, New York, New York, USA
(4) Department of Cardiology, The Animal Medical Center, New York, New York, USA

Correspondence
Anthony Fischetti, DVM, MS, DACVR, The Animal Medical Center – Diagnostic Imaging, 510 E 62nd Street, New York, NY 10065, USA.
Email: anthony.fischetti@amcny.org

Abstract
Application of artificial intelligence (AI) to improve clinical diagnosis is a burgeoning field in human and veterinary medicine. The objective of this prospective, diagnostic accuracy study was to determine the accuracy, sensitivity, and specificity of an AI-based software for diagnosing canine cardiogenic pulmonary edema from thoracic radiographs, using an American College of Veterinary Radiology-certified veterinary radiologist’s interpretation as the reference standard. Five hundred consecutive canine thoracic radiographs made after-hours by a veterinary Emergency Department were retrieved. A total of 481 of 500 cases were technically analyzable. Based on the radiologist’s assessment, 46 (9.6%) of these 481 dogs were diagnosed with cardiogenic pulmonary edema (CPE+). Of these cases, the AI software designated 42 of 46 as CPE+ and four of 46 as cardiogenic pulmonary edema negative (CPE−). Accuracy, sensitivity, and specificity of the AI-based software compared to radiologist diagnosis were 92.3%, 91.3%, and 92.4%, respectively (positive predictive value, 56%; negative predictive value, 99%). Findings supported using AI software screening for thoracic radiographs of dogs with suspected cardiogenic pulmonary edema to assist with short-term decision-making when a radiologist is unavailable.

KEYWORDS
Artificial intelligence, congestive heart failure, convolutional neural network, myxomatous mitral valve disease, thoracic radiograph

Abbreviations: ACVR, American College of Veterinary Radiology; AI, artificial intelligence; CHF, congestive heart failure; CI, confidence interval; CNN, convolutional neural network; CPE−, cardiogenic pulmonary edema negative; CPE+, cardiogenic pulmonary edema positive; ECVDI, European College of Veterinary Diagnostic Imaging; ML, machine learning; MMVD, myxomatous mitral valve disease; NPV, negative predictive value; OR, odds ratio; PPV, positive predictive value; RGT, report generation time.

Previous presentation or publication disclosure: The content of this paper has not been previously published, in full or in abstract form, nor has it been presented at a scientific meeting or congress.

1 INTRODUCTION

Heart disease occurs in approximately 10% of dogs visiting primary care veterinary clinics.[1] The most common acquired cardiac disorder, myxomatous mitral valve degeneration (MMVD), accounts for up to 75% of these cases.[1,2] MMVD frequently results in congestive heart failure (CHF) characterized by acute cardiogenic pulmonary edema (left-sided CHF; CPE) and respiratory distress, a rapidly progressive medical emergency that is fatal if misdiagnosed or if treatment is delayed. Thoracic radiography has long been considered the gold standard for diagnosing CPE.[3,4] Thoracic radiography remains a widely available, non-invasive, rapid test to determine the presence of CPE, especially when combined with the medical history and clinical presentation. Nevertheless, diagnosing CPE from thoracic radiographs can be challenging when patients display atypical radiographic features or subtle findings. While Doppler echocardiography and point-of-care lung ultrasonography can aid in the diagnosis of canine and feline CPE, these techniques require specific equipment, specialized training, and are user dependent.[4,5]

Artificial intelligence (AI) is a broad term used to describe computer algorithms that perform tasks to mimic cognitive function, such as learning or problem solving. Within artificial intelligence are more specific fields including machine learning (ML), representation learning, and deep learning. Machine learning refers to AI that uses observational data without specific programming by narrowing the algorithm’s parameters to optimize the relationship between the input and output.[6–8] In representation learning, the computer algorithm learns features to facilitate classification, with performance generally improving as more data are added.[6–8] Deep learning is a subfield utilizing multiple layers of algorithms to analyze data. To train the deep learning systems, hundreds of thousands of images are presented to the computer for the software to then guess, compare, and re-calibrate to improve its accuracy compared to the ground truth. Ground truth is a term that defines what the computer should consider as the correct answer.[8]

Growing interest in AI-based imaging software has been stimulated by the expansion of telemedicine, greater volumes of diagnostic images, and demand for more efficient report generation.
In human medicine, AI systems have been used effectively to detect various medical conditions by multiple diagnostic imaging modalities.[9–11] Recently, deep learning has been applied to veterinary medicine for the detection of lesions from feline and canine thoracic radiographs and the presence of left atrial enlargement in dogs.[12–14] Furthermore, the American College of Veterinary Radiology (ACVR) and European College of Veterinary Diagnostic Imaging (ECVDI) created the AI Education and Development Committee to evaluate the current and future impact of AI within the speciality.[15] The role of AI in veterinary radiology is rapidly growing and evolving, but the accuracy of the various AI systems and clinical applications remains undetermined.

A commercially available AI-based software (Vetology Innovations, San Diego, CA, USA) has been developed to review canine thoracic radiographs and generate diagnostic reports. The software relies on convolutional neural networks (CNNs), a type of deep learning network that uses layers of algorithms. Convolutional neural networks contain connected nodes designed to mimic neurons in the brain and are commonly used for deep learning imaging analysis.[6] This software was developed following deep learning best practices with proprietary training, testing, and comparison to the ground truth of a board-certified veterinary radiologist’s report. Multiple CNN models incorporated into this AI-based software are trained to detect various features of radiographic images of the canine thorax. The software automatically produces an output report containing a collection of findings.
In particular, these output reports determine the presence or absence of CPE associated with left-sided CHF from canine radiographic images.

The objective of this study was to determine the accuracy, sensitivity, and specificity of an AI-based software for diagnosing canine cardiogenic pulmonary edema from thoracic radiographs, using an ACVR-certified veterinary radiologist’s interpretation as the reference standard.

2 MATERIALS AND METHODS

2.1 Experimental design and subject selection

The study was a prospective, diagnostic accuracy design. Canine thoracic radiographs made after-hours by the Emergency Department at the Animal Medical Center between January 15, 2020 and June 14, 2020 were considered for inclusion. Approval from the Animal Medical Center’s Institutional Animal Care and Use Committee was obtained. All thoracic radiographs made between the hours of 18:00 and 8:00 were considered eligible, during which time the Diagnostic Imaging Department was unavailable. The inclusion criteria for this study were determined by an ACVR-certified veterinary radiologist (A.F.): the radiographic examination was required to have at least one view of the thorax. Age, breed, sex, body weight, and number of radiographic images were recorded for each enrolled patient as directed by the veterinary radiologist (A.F.).

2.2 Data recording

All eligible thoracic radiograph DICOM files were electronically transferred at 8:00 each morning to the AI software server (Vetology Innovations, San Diego, CA, USA) independent of patient history and radiologist interpretation. The specific CNN architecture, training methods, and training datasets were key performance drivers of CNN-based software algorithms.[16] The selected AI software evaluated radiographs for the presence of a range of diseases and produced a plain language report as output.
We specifically examined a single feature of this output: the determination of either a CPE+ or CPE− state. The algorithm to determine CPE disease was previously trained on radiographs of varied canine breeds, ages, geographies, and digital X-ray systems from diverse real-world cases with a broad range of CPE severities. The software performed whole image analysis and crafted plain language radiology reports based on internal CNN model results. If the CPE+ likelihood exceeded internally defined thresholds, the software indicated CPE+ status within the output report, thus providing the end user with a classification between CPE+ and CPE−. The selected AI software ran on individual servers containing an 8-core computer (Intel Xeon CPU E5-2660 v3 @ 2.60 GHz, 48 GB of DDR4 2666 MHz RAM) and solid-state drives. A load balancer distributed AI processing work between four servers for this trial. The AI server recorded a timestamp upon initiating image processing from a fully received DICOM and a second timestamp when the software generated a report. Lastly, the times when the study request was received by the AI software’s server and when the AI-generated report was electronically returned were recorded to calculate the report generation time (RGT). The RGT value was calculated as the difference between these two timestamps and thus did not include data transit times to and from the servers.

The AI software inspected the species field of the DICOM labels to reject any indicated as noncanine. All other included metadata in the DICOM labels were discarded. The anonymized radiograph image data were extracted and forwarded for automated image cropping. The AI model was applied to assess these auto-cropped images. If an image could not successfully be auto-cropped, the AI software company’s human technicians intervened to troubleshoot and provide an image that could be evaluated by the AI software. After the images were evaluated, the AI software then translated the results into a single plain language radiograph report. Finally, the report was automatically sent back to the requesting clinician. By reading these reports, we extracted a binary classification of each case between CPE+ and CPE− disease states attributed to the AI software. Images were excluded from statistical analysis if a report was not successfully generated by the AI software. Radiographs participating in this study were not distinguished from the software’s regular workload.

All images assessed by AI software were independently evaluated by one ACVR board-certified veterinary radiologist (A.F.) with 15 years of post-graduate experience. The radiologist was blinded to the patient’s history, signalment, and the results of the AI-generated report. All results were separated into two possible categories: cardiogenic pulmonary edema positive (CPE+) or cardiogenic pulmonary edema negative (CPE−).

2.3 Data analysis

Descriptive analyses were performed by a data scientist (P.S.) with 2 years of formal training in statistics using commercial statistical software (R version 4.0.3, Vienna, Austria; Microsoft Excel 2021, Redmond, WA, USA).
Estimated prevalence of CPE in the study population was calculated using a retrospective review of the hospital’s electronic medical records database for all dogs that received thoracic radiographs and furosemide for treatment of presumptive left-sided CHF. In 2019, 1693 canine patients were evaluated using thoracic radiographs through the after-hours Emergency Department between the hours of 18:00 and 8:00, and 183 of these patients were treated with furosemide during the same visit. Based on these data, the prevalence of CPE was calculated to be 10.8%. Sample size calculations were then made to target high sensitivity, given the importance of minimizing the risk of false negatives in this disease population. Utilizing Buderer’s formula, we calculated an estimated sample size of 375 thoracic radiographic studies.[17] Standard calculations of accuracy, sensitivity, and specificity were performed, using the radiologist’s assessment as the reference standard. For cases diagnosed by a radiologist as CPE+, positive AI diagnoses were classified as True Positives (TP) while negative AI diagnoses were classified as False Negatives (FN). For cases diagnosed by a veterinary radiologist as CPE−, negative AI diagnoses were classified as True Negatives (TN), while positive AI diagnoses were classified as False Positives (FP) (Table 1). The sensitivity was calculated as TP/(TP+FN). The specificity was calculated as TN/(TN+FP). Overall accuracy was calculated as (TP+TN)/(TP+TN+FP+FN). Confidence interval bounds were calculated using the Wilson score interval.
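The calculations described above can be illustrated with a minimal Python sketch. This is not the authors' code (the study reports using R and Excel); the function names and the choice of z = 1.96 for a 95% interval are assumptions made for illustration. The example counts, 42 AI-positive cases among 46 radiologist-positive cases, are taken from the abstract.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def sensitivity(tp, fn):
    """Fraction of reference-positive cases that the test calls positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of reference-negative cases that the test calls negative."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Fraction of all cases on which the test and the reference agree."""
    return (tp + tn) / (tp + tn + fp + fn)


# Counts quoted in the abstract: TP = 42 and FN = 4 among 46 CPE+ cases.
sens = sensitivity(tp=42, fn=4)
low, high = wilson_interval(42, 46)
print(f"sensitivity = {sens:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

With these inputs the sketch gives a sensitivity of about 0.913 with bounds of roughly 0.797 and 0.966, consistent with the interval of 79.7% to 96.6% reported in the Results.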
The positive predictive value (PPV) and negative predictive value (NPV) were calculated using the following equations respectively: PPV = TP/(TP+FP) and NPV = TN/(TN+FN) (Table 1). Lastly, the Youden’s index was calculated as sensitivity + specificity − 1.

TABLE 1 Basis for sensitivity and specificity calculations displaying which studies were considered true positives, true negatives, false positives, and false negatives

            Radiologist CPE+       Radiologist CPE−
AI CPE+     True Positive (TP)     False Positive (FP)
AI CPE−     False Negative (FN)    True Negative (TN)

Abbreviations: AI, artificial intelligence; CPE−, cardiogenic pulmonary edema negative; CPE+, cardiogenic pulmonary edema positive.

Regression analyses were performed by an American College of Veterinary Emergency and Critical Care-certified veterinary specialist (J.W.) with a doctoral degree in biosciences using commercial statistical software (Stata SE v15.1, College Station, TX, USA). A stepwise logistic regression analysis was applied to assess for independent association between patient characteristics and disagreement of AI and radiologist assessment of the images. The model and goodness of fit were evaluated by likelihood ratio and comparison to chi-square assessment, and all values evaluated as statistically significant demonstrated appropriate models. Bonferroni correction was applied for multiple comparisons. Statistical significance was set at a P value of <0.05.

3 RESULTS

Thoracic radiographs had been acquired using one of two available digital radiographic systems (Quantum Medical Imaging, Quantum HF Radiographic Imaging System, Ronkonkoma, NY). Both units had the same flat panel digital radiographic image detector (Canon U.S.A., Inc., Canon CXDI-50G, Melville, NY) with the same postprocessing algorithms optimized for thoracic studies. The technique varied depending on the thickness of the animal, with the kVp ranging from 85–100 and the mAs from 4–5.
All 500 cases received a CPE+ or CPE− diagnosis from the veterinary radiologist. The AI software produced a CPE diagnosis for 481/500 cases, an analyzability rate of 96.2%. The 19/500 cases that did not receive a generated report from the AI software were excluded from further analyses. In 12 of the 19 non-analyzable cases, the AI software did not make a fully automated CPE diagnosis due to a specific failure to automatically crop images. In these 12 cases, a human technician manually cropped images, after which the AI software produced and delivered a CPE diagnosis. The remaining 7/19 cases did not have an AI-generated report due to internal server error.

Of the remaining 481 image sets, patient age ranged from 1 month to 18 years (median, 9.2 years). Seventy-one different breeds were represented, with 233 females (40 intact, 193 spayed) and 248 males (57 intact, 191 neutered). Body weight ranged from 1 kg to 82 kg (median, 9.3 kg). The number of images per study ranged from 1 to 6 with a median, mean, and mode of 3.0. A total of 1441 radiographic images from the 481 image sets were included in this study.


TABLE 2 Comparison of CPE diagnosis of the ACVR-certified radiologist and AI software (n = 500) where the radiologist provided a diagnosis for all 500 cases and the AI software provided a diagnosis for 481 cases

         Radiologist    AI
CPE+     49             75
CPE−     451            406
Total    500            481

Abbreviations: AI, artificial intelligence; CPE−, cardiogenic pulmonary edema negative; CPE+, cardiogenic pulmonary edema positive.

TABLE 3 Comparison of the CPE diagnosis of the ACVR-certified radiologist and AI software in the analyzable cases (n = 481) where the radiologist and AI software agreed on CPE+ for 42 cases and CPE− for 402 cases

           Radiologist CPE+    Radiologist CPE−
AI CPE+    42                  33
AI CPE−    4                   402
Total      46                  435

Abbreviations: AI, artificial intelligence; CPE−, cardiogenic pulmonary edema negative; CPE+, cardiogenic pulmonary edema positive.

Radiographic evaluation by the veterinary radiologist reported 49/500 CPE+ and 451/500 CPE− cases (Table 2). Of these, 3 CPE+ and 16 CPE− image sets were excluded; therefore, the radiologist reported 46/481 CPE+ and 435/481 CPE−. The AI software reported 75/481 CPE+ and 406/481 CPE− (Table 2). Of the 46 cases diagnosed by the radiologist as CPE+, the AI software diagnosed 42 as CPE+ and 4 as CPE− (Table 3), reflecting 91.3% sensitivity (95% confidence interval, 79.7% to 96.6%). Of the 435 cases diagnosed by the radiologist as CPE−, AI diagnosed 402 as CPE− and 33 as CPE+ (reflecting 92.4% specificity, 95% confidence interval, 89.5% to 94.5%). The positive predictive value was 56% and negative predictive value was 99% based upon AI diagnosis of cardiogenic pulmonary edema from canine thoracic radiographs. The Youden’s index was 0.84.

The AI software reported a correct diagnosis in 444/481 cases and a false diagnosis in 37/481 (accuracy, 92.3%, 95% confidence interval, 89.6% to 94.4%). Of the 37 cases with a false diagnosis, 33 were FP and 4 were FN.
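The predictive values and Youden's index reported above can be reproduced from the counts in Table 3. The following Python sketch is an illustration only, not the authors' analysis code; the helper `predictive_values` is a hypothetical name, and the 50% prevalence used in the second call is an assumed value, not study data, included to show how strongly the PPV depends on how common CPE is among the dogs being screened.

```python
def predictive_values(sens, spec, prevalence):
    """PPV and NPV from sensitivity, specificity, and prevalence (Bayes' rule)."""
    tp = sens * prevalence              # expected true-positive fraction
    fn = (1 - sens) * prevalence        # expected false-negative fraction
    fp = (1 - spec) * (1 - prevalence)  # expected false-positive fraction
    tn = spec * (1 - prevalence)        # expected true-negative fraction
    return tp / (tp + fp), tn / (tn + fn)


# Counts from Table 3 (radiologist as reference standard).
TP, FP, FN, TN = 42, 33, 4, 402
total = TP + FP + FN + TN

sens = TP / (TP + FN)
spec = TN / (TN + FP)
youden = sens + spec - 1
prevalence = (TP + FN) / total  # 46 of 481 analyzable cases

ppv, npv = predictive_values(sens, spec, prevalence)

# Hypothetical setting with 50% prevalence (assumption for illustration only).
ppv_high, npv_high = predictive_values(sens, spec, 0.50)

print(f"Youden's index = {youden:.2f}")
print(f"study prevalence {prevalence:.3f}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
print(f"assumed prevalence 0.500: PPV = {ppv_high:.2f}, NPV = {npv_high:.2f}")
```

At the study prevalence the sketch returns the reported PPV of 0.56 and NPV of 0.99. With the same sensitivity and specificity but the assumed higher prevalence, the PPV rises above 0.9, which is why a low PPV in a low-prevalence emergency population does not by itself indicate a poor classifier.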
The FP cases included radiographic findings suggesting a normal thorax (n = 17), pleural effusion (n = 4), pneumonia (n = 4), cardiomegaly (n = 3), pulmonary neoplasia/metastasis (n = 2), chronic lower airway disease (n = 2), and non-cardiogenic pulmonary edema (n = 4); a few of the studies displayed more than one of the listed abnormalities.

In order to assess for patient characteristics influencing the likelihood of disagreement between AI and radiologist assessment of images, a regression model was built to include disagreement (yes vs. no) as a dependent variable. Independent variables included age and patient body weight. A significant model emerged demonstrating that patient age influences the likelihood of disagreement (agreed: median 10 y, range 0.08–18 vs disagreed: median 12 y, range 1–18; OR: 1.13; CI: 1.03–1.25; p = 0.005). This indicated that for each increase in age by 1 year, the odds of disagreement increased by about 13% (Figure 1). No difference was found between agreement groups in regard to patient body weight (agreed: 8.5 kg, range 0.8–82 vs disagreed: 9.3 kg, range 1.6–51).

FIGURE 1 Box plot displaying the significantly higher median patient age in cases with a discrepant CPE diagnosis between the board-certified radiologist and AI software compared to cases where they agreed. The central line is the median, the lower and upper limits of the box represent the 25% and 75% quantiles, the whiskers represent the range minus the outliers, and the outliers are designated by dots.

The AI report generation time (RGT) was available for 460/481 cases. If images were manually cropped and resubmitted by human technicians, the AI time stamp would reset so the RGT was not recorded.
For the 96% of cases for which timing data existed, the average RGT was 2.45 minutes with a maximum of 8.3 minutes and a minimum of 0.65 minutes.

4 DISCUSSION

The present study demonstrated that AI software correctly identified CPE from canine thoracic radiographs in approximately 9 out of every 10 cases diagnosed with CPE by a board-certified veterinary radiologist. We identified a high NPV, reflecting that the CPE− cases identified by AI had a high probability of being negative. This indicated that under present study parameters, an AI report indicating CPE− had a very high likelihood (99%) of agreeing with the radiologist’s interpretation. On the other hand, the low PPV indicated that an AI report indicating CPE+ agreed with the radiologist’s diagnosis only 56% of the time. Thus, if an AI-generated CPE− report is received by a clinician, left-sided congestive heart failure is unlikely, and consideration should be given for further clinical investigation or diagnostic testing. In general, our data from 481 radiographic studies suggest that in our study population, the AI software could be considered a good screening tool in the low CHF+ prevalence canine population. These findings also suggest that the acuity of the present AI software versions may be enhanced when used in conjunction with a clinician or radiologist’s assessment.

Older patient age was linked to a higher likelihood of discrepancy between the board-certified radiologist and AI in CPE diagnosis. Radiographic changes seen in the lungs of aging dogs without clinical evidence of disease have long been reported and include pleural thickening, increased nonvascular linear markings, and nodular lesions.[18] A more recent study has also confirmed the increased likelihood of the presence of osseous metaplasia and lung collapse in the aging canine population using CT.[19] It is possible that radiographic changes that are considered to be normal age-related changes may have led to misinterpretation by the AI software. Future prospective evaluations designed to specifically investigate age are indicated based on these findings. As there was no difference found between agreement groups in relation to the patient body weight, breed differences were not investigated as they were unlikely to yield statistically significant data.

Veterinary literature reporting a clinical application of AI-based software is limited. In one retrospective study, the reported accuracy, sensitivity, and specificity were similar between the ACVR-certified veterinary radiologist and AI when investigating left atrial enlargement from 81 canine thoracic radiographs.[13] A recent prospective study investigated the use of AI software in detecting thoracic lesions from 120 canine and feline thoracic radiographs.
Investigators demonstrated a significantly lower error rate for AI image interpretation than for veterinarians with varying levels of experience or for veterinarians aided by the AI software, all measured against a reference standard of an ECVDI or ACVR board-certified radiologist’s interpretation.[14] Similar to the present study, these studies support that AI can provide adjunctive data that may be useful in guiding a clinical diagnosis.

Because the thoracic radiographs in the present study were submitted manually and consecutively for AI analysis during morning hours, this may have created a bottleneck effect for AI response time. Accordingly, it may be useful to consider the software’s capability for automatic downloading of thoracic radiographs to the AI software at the completion of each study in order to try to reduce software wait time. The authors are not aware of published data regarding the length of time for board-certified veterinary radiologists to interpret a set of thoracic radiographs and to generate a report. One study in human medicine tested the capability of radiology residents to read thoracic radiographs and estimated that the study could be appropriately evaluated in 1.5 min.[20] This is below the average of 2.45 min that the AI software took in our study. However, that study only measured the time required for the radiology resident to identify the lesions and not to generate a report. It would be straightforward for the AI software to produce a CPE determination directly from CNN outputs, but, as currently designed, only the plain language report is provided as output. Direct reporting of the binary classification for CPE instead of producing a plain language report could be considered in an effort to shorten the RGT achieved by AI software.
Further analyses of RGT for AI software as compared to radiologists are warranted.

Numerous studies investigating the application of AI in humans have reported a range of utility and outcomes when specific AI algorithms are evaluated to detect a specific disease or condition. As of October 2020, 64 AI/ML-based algorithms had been approved by the Food and Drug Administration (FDA).[21] Application of AI systems has been suggested for optimizing image acquisition and processing, enhancing patient care, and optimizing workflow.[22] Nevertheless, the search for a role of AI in human medicine remains a work in progress,[22–24] and a seamless integration and clear-cut role of AI in human and veterinary diagnostic imaging has yet to be defined.[25,26]

The present study has limitations that warrant discussion. Some cases had to be excluded from analyses due to failure of the AI software to generate a fully automated diagnosis. Possible causes could have included animal rotation or malpositioning, poor image quality, or failure to automatically crop for the specific area being evaluated. While human intervention helped to resolve this problem in our sample, these cases were excluded from analyses to minimize potential bias. We chose to have a single radiologist assess for CPE; therefore, the generalizability of our findings remains unknown. This decision was made to minimize possible confounding factors between investigators that could have introduced type 2 errors. In future studies, it may be beneficial to include multiple ACVR- or ECVDI-certified veterinary radiologists. Another limitation was that the study was designed to compare the ability of the AI to a veterinary radiologist who was similarly blinded to all patient information, clinical findings, and history. Final diagnostic imaging reports can list multiple differentials for lung pathology. To avoid ambiguity, the board-certified radiologist was given a binary option of CPE+ or CPE−.
For these reasons, the AI software and board-certified radiologist were not compared to the final diagnostic imaging report. Lastly, this study was not designed to assess the impact of AI software on clinical outcomes in dogs with CPE. Future investigations of the effect of AI on clinical decision-making and patient outcomes may help to better understand the integration of these systems in the management of dogs with CPE.

In conclusion, findings from the present study supported the use of AI software for an initial screening assessment of canine thoracic radiographs for CPE when a veterinary radiologist is not available. The AI software’s interpretations had a high NPV in this study; however, the comparatively low PPV suggested that the AI software had limitations and that confirmation of AI assessment by a veterinary radiologist remains important. Future studies are needed to assess the generalizability of these study findings for varying hospitals and veterinary radiologist observers.

LIST OF AUTHOR CONTRIBUTIONS

Category 1
(a) Conception and Design: Kim, Fischetti, Weltman
(b) Acquisition of Data: Kim, Fischetti
(c) Analysis and Interpretation of Data: Sreetharan, Kim, Fischetti, Fox


Category 2
(a) Drafting the Article: Kim, Fischetti, Sreetharan, Fox, Weltman
(b) Revising the Article for Intellectual Content: Kim, Fischetti, Sreetharan, Fox, Weltman

Category 3
(a) Final Approval of the Completed Article: Kim, Fischetti, Sreetharan, Fox, Weltman

Category 4
(a) Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved: Kim, Fischetti, Sreetharan, Fox, Weltman

CLAIM DISCLOSURE
Along with the advice of the ACVR and ECVDI’s AI Education and Development Committee, the following was used for guidance in preparation of this manuscript: Checklist for Artificial Intelligence in Medical Imaging (CLAIM): a guide for authors and reviewers. Radiol Artif Intell. 2020 Mar 25;2(2):e200029.

CONFLICT OF INTEREST
Pratheev Sreetharan is a member of the development team of the artificial intelligence software used in this study.

ORCID
Eunbee Kim https://orcid.org/0000-0002-6756-4324

REFERENCES
1. Keene BW, Atkins CE, Bonagura JD, et al. ACVIM consensus guidelines for the diagnosis and treatment of myxomatous mitral valve disease in dogs. J Vet Intern Med. 2019;33:1127-1140.
2. Buchanan JW. Prevalence of cardiovascular disorders. In: Sisson D, Sydney Moïse N, Fox PR, eds. Textbook of Canine and Feline Cardiology: Principles and Clinical Practice. WB Saunders; 1999:457-470.
3. Thrall DE, Widmer WR. Textbook of Veterinary Diagnostic Radiology. Elsevier; 2018:684-709.
4. Schober KEE, Hart TM, Stern JA, et al. Detection of congestive heart failure in dogs by Doppler echocardiography. J Vet Intern Med. 2010;24:1358-1368.
5. Ward JL, Lisciandro GR, Keene BW, Tou SP, DeFrancesco TC. Accuracy of point-of-care lung ultrasonography for the diagnosis of cardiogenic pulmonary edema in dogs and cats with acute dyspnea. J Am Vet Med Assoc. 2017;250:666-675.
6. Chartrand G, Cheng PM, Vorontsov E, et al. Deep learning: a primer for radiologists. Radiographics. 2017;37:2113-2131.
7. Choy G, Khalilzadeh O, Michalski M, et al. Current applications and future impact of machine learning in radiology. Radiology. 2018;288:318-328.
8. Willemink MJ, Koszek WA, Hardell C, et al. Preparing medical imaging data for machine learning. Radiology. 2020;295:4-15.
9. Chassagnon G, Vakalopoulou M, Paragios N, Revel MP. Artificial intelligence applications for thoracic imaging. Eur J Radiol. 2020;123:1-6.
10. Cicero M, Bilbily A, Colak E, et al. Training and validating a deep convolutional neural network for computer-aided detection and classification of abnormalities on frontal chest radiographs. Invest Radiol. 2017;52:281-287.
11. Winkel DJ, Heye T, Weikert TJ, Boll DT, Stieltjes B. Evaluation of an AI-based detection software for acute findings in abdominal computed tomography scans toward an automated work list prioritization of routine CT examinations. Invest Radiol. 2019;54:55-59.
12. Yoon Y, Hwang T, Lee T. Prediction of radiographic abnormalities by the use of bag-of-features and convolutional neural networks. Vet J. 2018;237:43-48.
13. Li S, Wang Z, Visser LC, Wisner ER, Cheng H. Pilot study: application of artificial intelligence for detecting left atrial enlargement on canine thoracic radiographs. Vet Radiol Ultrasound. 2020;61:1-8.
14. Boissady E, de La Comble A, Zhu X, Hespel A. Artificial intelligence evaluating primary thoracic lesions has an overall lower error rate compared to veterinarians in conjunction with the artificial intelligence. Vet Radiol Ultrasound. 2020;61:1-9.
15. Artificial intelligence. American College of Veterinary Radiology website. May 7, 2021. Accessed May 12, 2021. https://acvr.org/artificial-intelligence-in-veterinary-diagnostic-imaging-and-radiation-oncology/
16. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. Paper presented at: International Conference on Learning Representations 2015; May 7–9, 2015; San Diego, USA.
17. Hajian-Tilaki K. Sample size estimation in diagnostic test studies of biomedical informatics. J Biomed Inform. 2014;48:193-204.
18. Reif JS, Rhodes WH. The lungs of aged dogs. A radiographic-morphologic correlation. J Am Vet Radiol Soc. 1966;7:5-11.
19. Hornby NL, Lamb CR. Does the computed tomographic appearance of the lung differ between young and old dogs? Vet Radiol Ultrasound. 2017;1-6.
20. Fabre C, Proisy M, Chapuis C, et al. Radiology residents’ skill level in chest x-ray reading. Diagn Interv Imaging. 2018;99:361-370.
21. Benjamens S, Dhunnoo P, Mesko B. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. NPJ Digit Med. 2020;118:1-8.
22. Hardy M, Harvey H. Artificial intelligence in diagnostic imaging: impact on the radiography profession. Br J Radiol. 2020;93:1108.
23. Oren O, Gersh BJF, Bhatt DL. Artificial intelligence in medical imaging: switching from radiographic pathological data to clinically meaningful endpoints. Lancet Digit Health. 2020;2:486-488.
24. Gampala S, Vankeshwaram V, Gadula SSP. Is artificial intelligence the new friend for radiologists? A review article. Cureus. 2020;12:1-7.
25. O’Neill TJ, Xi Y, Browning T, et al. Active reprioritization of the reading worklist using artificial intelligence has a beneficial effect on the turnaround time for interpretation of head CT with intracranial hemorrhage. Radiology. 2020;3:2.
26. Retson TA, Masutani EM, Golden D. Clinical performance and role of expert supervision of deep learning for cardiac ventricular volumetry: a validation study. Radiology. 2020;2:4.

How to cite this article: Kim E, Fischetti AJ, Sreetharan P, Weltman JG, Fox PR. Comparison of artificial intelligence to the veterinary radiologist’s diagnosis of canine cardiogenic pulmonary edema. Vet Radiol Ultrasound. 2022;1-7. https://doi.org/10.1111/vru.13062
