Breast Cancer Detection Methodologies using Image Processing: Current Trends and Era in Machine Learning and Risk Mitigation
Abstract
Breast cancer is a potentially fatal disease, and treatment is most effective when the disease is diagnosed early. Medical imaging is critical for detecting breast cancer early and accurately; nevertheless, it is prone to false negative and false positive findings, which frequently lead to incorrect diagnosis and therapy. The response time is also long, delaying treatment for patients whose lives could have been saved had the condition been recognised sooner. Increasing survival rates and decreasing the treatment-related side effects of breast cancer have long been established goals of screening programs. Artificial intelligence is a broad discipline with numerous algorithms that can improve the sensitivity, specificity, and accuracy of the imaging modalities used for breast cancer detection. The initial sections of the paper highlight various risk factors for the disease, followed by an in-depth study of existing work on the imaging processes involved in breast cancer detection. The later sections describe recent deep learning algorithms and their achievements in breast cancer detection. The available data sets are also tabulated.
1. INTRODUCTION
Cancer, characterised by abnormal cell proliferation, is frequently fatal. It is a condition in which a specific type of body cell grows uncontrollably and compromises the function of neighbouring cells and tissues [1, 2]. The process begins with cell injury and is followed by an uncontrolled rate of cell division, leading to the creation of masses of tissue known as tumours. As they grow, these tumours damage the patient's central nervous system, circulatory system, and other essential organs. The patient's condition worsens if the tumours induce the body to release hormones that interrupt its normal functioning. Cancer is usually named after the organ in which it first manifests itself, such as throat cancer, lung cancer, breast cancer, stomach cancer, cervical cancer, and kidney cancer [3, 4]. Even though both sexes are equally at risk, some cancers are more common in one sex than the other due to anatomical and biological differences. Breast cancer is an example of a cancer that predominantly affects women [5]. Cancer incidence and mortality have increased continually worldwide in recent years, posing a serious threat to human life and health. Breast cancer is one of the leading causes of death among women globally. One of the most noticeable patterns in the global cancer data for 2020 is the sharp increase in the incidence of breast cancer, which has surpassed lung cancer as the most prevalent cancer worldwide. A pathology examination is the most accurate means of diagnosing breast cancer. The clinical diagnosis is based on signs and symptoms described by the patient, whereas the radiological diagnosis is based on medical imaging results. Researchers assist physicians in processing and analysing medical images through imaging, medical image processing technology, computer analysis, and calculation, or a computer-aided diagnostic (CAD) system.
Machine learning has been widely employed in breast cancer diagnosis since the emergence of CAD technology. Aside from the difficulties posed by the specific properties of histopathological images, accurate feature extraction is essential for the automatic recognition of breast cancer histological images. Today's standard approach to analysing breast cancer histopathology images relies heavily on manually designed (hand-crafted) features. Male breast cancer accounts for just 0.2% of all malignant tumours. Breast cancer primarily affects perimenopausal women and is rare under the age of 25. Women make up about 50% of the world's population, and over 27% of cancer patients are breast cancer patients. Breast cancer ranks second in mortality among all malignancies [6]. The milk-producing glands in the breast are called lobules, and breast cancer originates in these tissues [7, 8]. It is not contagious, although women with a family history of the disease are at a higher risk of developing it [9, 10]. Fig. (1) represents the risk factors that can allow breast cancer to develop inside a patient's body [11-18].
The patient has a chance of survival if the condition is caught early. Cancer patients are at risk of death if therapy is delayed for any reason [19]. If a woman notices a lump in her breast, she should see a doctor immediately, as it may be an early sign of breast cancer that requires further clinical study and, ultimately, a surgical biopsy. Breast lumps do not always indicate cancer; they can be harmless in some cases [20, 21]. This paper aims to provide a comprehensive overview of the many strategies for the early diagnosis of breast cancer, in order to assess their capabilities and identify areas of improvement that can be addressed using cutting-edge technology. This work has the potential to enhance the current system of early diagnosis of breast cancer, to save lives and reduce the likelihood of false detection.
Fig. (2) represents the four quadrants into which the breast region is divided to study and analyse the probability of tumour occurrence, expressed as a percentage. Generally, left breast cancers are more likely to occur than bilateral ones. Early detection of fatal diseases relies on medical imaging processes, also called cancer screening [22]. Breast cancer screening includes mammography, which can detect disease even when no symptoms are present. False positive and false negative reports lead to unnecessary biopsies and undetected malignancy, respectively. Double testing [23] to obtain a precise conclusion adds expense, time, and labour for the patient. Computer-aided solutions [24] can support both detection and diagnosis. They can help doctors analyse results more accurately and clearly, giving the patient a more certain test result [25]. Several recent technologies can also make the process more reliable for the women affected, as timely detection is the most effective tool against this deadly disease. To make the most of the available technologies, relevant research must be reviewed to assess the untapped potential of the underlying technologies and pinpoint the areas where more investigation is needed [26].
2. BREAST CANCER DETECTION
Once a patient is suspected of having breast cancer, certain diagnostic tests are required for diagnosis and prognosis. The disease is confirmed in two basic steps: breast cancer imaging (Section 2.1) and biopsy (Section 2.2).
2.1. Breast Cancer Imaging
The primary goal of these imaging techniques is to pinpoint the location of the tumour; subsequent procedures are used to verify whether the tumour is malignant. Mammography, ultrasound, MRI, and tomosynthesis are discussed in this article as methods of breast cancer diagnosis. Breast cancer imaging approaches and the biopsy procedure for detecting breast cancer are contrasted in Fig. (3).
2.1.1. Mammographs
Mammographs are X-ray images of the breast, taken at special angles to identify hidden tumours. The technique uses ionising radiation to detect these tumours, which become difficult to detect in denser breasts. Generally, the accuracy of mammographs lies between 85% and 90%. Additional testing is used to confirm the disease after the first mammography. Debelee (2019) elaborated on the cancer imaging processes and techniques to which deep learning algorithms can be applied. Screen-film mammography (SFM) is not a digital screening technique, but it is highly sensitive and is still practised in countries such as Ethiopia, despite the trade-off between dynamic range and contrast resolution [27, 28]. Digital mammography (DM) is an advancement on film mammography: being digital, it can be combined with computer-aided diagnosis (CAD) solutions and is quicker in response, but it suffers from low specificity [29]. Fig. (4) categorises breast cancer into grading categories from BIRADS 1 to BIRADS 6. BIRADS is an acronym for Breast Imaging Reporting and Data System, which is used to grade the mammogram test result for breast cancer detection. It gives information about breast density and indicates whether any abnormalities need further investigation. BIRADS is a numeric scale ranging from 1 to 6 (BIRADS 0 indicates an incomplete test, for which a retest is required).
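The BIRADS scale described above can be sketched as a simple lookup table. The category wording below paraphrases the commonly used BI-RADS assessment categories and is an assumption of this sketch rather than text taken from Fig. (4); the names `BIRADS_CATEGORIES` and `describe_birads` are hypothetical.

```python
# Illustrative lookup of BI-RADS assessment categories.
# The descriptions paraphrase the commonly used category wording (assumption).
BIRADS_CATEGORIES = {
    0: "Incomplete: additional imaging or retest required",
    1: "Negative",
    2: "Benign finding",
    3: "Probably benign",
    4: "Suspicious abnormality",
    5: "Highly suggestive of malignancy",
    6: "Known biopsy-proven malignancy",
}


def describe_birads(category: int) -> str:
    """Return a readable description for a BI-RADS category (0 to 6)."""
    if category not in BIRADS_CATEGORIES:
        raise ValueError(f"BI-RADS category must be 0 to 6, got {category}")
    return f"BI-RADS {category}: {BIRADS_CATEGORIES[category]}"


for score in (0, 1, 6):
    print(describe_birads(score))
```

A table of this kind only records the grading vocabulary; it is not a substitute for the radiologist's assessment.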
2.1.2. Ultrasound
Ultrasound (US) uses sound waves to detect the location of the tumour. It is most helpful for determining whether a discovered mass is a fluid-filled cyst or a solid mass: cancer is less likely if a cyst is found, and more likely if a solid mass is found. It also acts as a guidance system during biopsy, guiding the needle to the tissue that is extracted for further tests, and it is used to examine the armpit lymph nodes. In dense breasts, it is used when mammograms cannot detect malignancy. US is adopted as a second choice to DM because it cannot be interpreted straightforwardly and is operator dependent. However, it does not involve ionising radiation and is therefore safe in this regard [30, 31]. Fig. (5) distinguishes the cancerous lump on the basis of fluid or solid mass after the ultrasound method.
2.1.3. Magnetic Resonance Imaging
Breast MRI is performed on MRI equipment fitted with special breast coils. MRI uses radio waves and very strong magnets for cancer detection. It is used for high-risk patients, such as women with a family history of cancer or women suspected of having cancerous growth in regions that mammographs and other methods do not reveal. In this process, an MRI contrast agent is injected to highlight the borders of cancerous growths inside the breast and make them noticeable for investigation. MRI is beneficial for determining the extent of malignant growth, but it cannot be used as the first-line test for cancer detection because it can miss cancer that mammography can detect [32]. Fig. (6) recommends MRI screening for patients at high cancer risk.
2.1.4. Digital Breast Tomosynthesis
Tomography is the technique of imaging by sections using a penetrating wave. Combined with 3-D image reconstruction, tomography is used in breast imaging to improve the visibility of lesions; this combination is termed digital breast tomosynthesis (DBT) [33]. It is a volumetric reconstruction of 2-D images obtained by X-rays taken at different angles for different sectional views. It is similar to a 3-D mammograph in that an X-ray tube moves around the breast in a circular arc to produce a 3-D image for breast cancer screening [34, 35]. Fig. (7) classifies the digital breast tomosynthesis process into three further processes: lesion localisation, lesion conspicuity, and micro-calcification (Table 1).
The imaging processes are combined according to the situation of the cancerous growth in the breast, to confirm the actual status of the disease in the patient's body. Digital mammography is the basic preliminary examination practised nowadays and is followed by other imaging processes that provide further details about the suspected growth of tissue forming a lump or tumour. These combinations also help to gather associated information, such as the nature of the mass (solid or fluid) and the extent of its growth, and to find other growths hidden inside dense breast layers. The improvement in the outcome of the imaging processes for breast cancer detection when they are used in combination is schematically represented in Fig. (8). The two evaluation parameters considered in this figure are specificity and sensitivity, and the outcome is measured and evaluated on this basis. Both combinations build on digital mammography as the preliminary process [31].
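Sensitivity and specificity, the two evaluation parameters named above, follow directly from the counts of true and false reports. The sketch below uses their standard definitions; the function name `screening_metrics` and the example counts are illustrative assumptions and are not taken from any study cited in this paper.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard screening metrics from confusion-matrix counts.

    tp: cancers correctly reported positive
    fp: healthy cases wrongly reported positive
    tn: healthy cases correctly reported negative
    fn: cancers wrongly reported negative
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy case")
    return {
        # Fraction of actual cancers that the test detects.
        "sensitivity": tp / (tp + fn),
        # Fraction of healthy cases that the test correctly clears.
        "specificity": tn / (tn + fp),
        # Fraction of all reports that are correct.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = screening_metrics(tp=85, fp=10, tn=90, fn=15)
print(metrics)
```

A false negative lowers sensitivity and a false positive lowers specificity, which is why both are reported alongside accuracy.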
2.2. BIOPSY
If a patient is found to have suspected lesions in the breast, tissue is removed from the region of interest (ROI) located by any of the imaging processes discussed in Section 2.1, so that it can be tested further. This removal of tissue from the patient's breast is termed a biopsy. Further discussion of biopsy is beyond the scope of this paper. Fig. (9) shows the possible outcomes of a biopsy, which can be cancerous or non-cancerous. Histopathological images are obtained after a biopsy is performed for breast cancer detection. These images are used to identify the cancer and the treatment to be followed by the patient as the next step. Several methods exist for obtaining histopathological images.
3. ROLE OF ARTIFICIAL INTELLIGENCE IN BREAST CANCER IMAGING
Sections 2.1 and 2.2 described the different types of images obtained from each process. These images are analysed to draw conclusions about the suspected growth of tissue located as the ROI in the images. It is difficult to recognise and identify these tumours manually and to reach conclusions based on naked-eye observation of the images obtained from the cancer detection methodologies discussed in the previous sections. Fig. (10) demonstrates the basic steps in image analysis, such as segmentation, feature extraction, classification, and prediction.
The latest techniques for data acquisition, pre-processing, and processing can be combined with medical imaging to derive maximum information from the images obtained by the imaging processes discussed in the previous sections. Artificial intelligence (AI) is the field in which an artificial brain is embedded in a machine so that the machine is expected to perform as well as humans. This artificial brain is an artificial neural network with a multi-layered architecture containing many hidden layers. Many algorithms have been developed in this field for optimised usage and better performance. Once a machine is AI enabled, it has the inherent advantage of making decisions and reacting to its environment, so letting the machine learn from experience and its surroundings is important. Machine learning (ML) is the subset of AI concerned with letting the machine learn. Learning is further categorised by its mode, such as supervised or unsupervised learning. Supervised learning requires training data to train the neural network; in medical imaging, this training data is available in the form of different datasets. The process is similar to learning with a tutor: each answer is compared with the answer in the tutor's mind, and when they match, the fact is stored in the student's mind. The datasets used are annotated and marked with ROI, etc. Unsupervised learning is not guided; it is exploration-based learning, and its data sets are not annotated or marked. Deep learning (DL) is a further subdomain of ML and AI. DL models can be categorised by the data received at the input stage: artificial neural networks accept data in the form of numbers, convolutional neural networks receive data in the form of images, and recurrent neural networks accept data in the form of time series.
In this paper, we are concentrating on medical imaging, thus the input is in the form of images, and hence our technology of interest is deep convolutional neural networks as a facilitator and enhancer of medical imaging. Fig. (11) establishes the relation between these technologies and highlights the work that can be used more in favour of medical imaging.
3.1. Deep Learning Convolutional Neural Networks for Breast Cancer Detection
Recent advances in data gathering have given researchers an excellent opportunity to obtain and analyse data linked to medical diagnosis, and to deploy various machine learning algorithms on it to automate the screening process and improve its accuracy [45, 46]. Deep learning, a machine learning technology, has been developing for years [47]. With this method, the machine learns through experience, and a number of algorithms are designed to control and direct this learning process [48, 49]. This technology can facilitate medical imaging for breast cancer screening, decrease the chance of erroneous reports, and enable an appropriate response after diagnosis. Deep learning accepts data in various forms, such as text, images, and time series. For text input, an artificial neural network (ANN) is used [50]; for images, a convolutional neural network (CNN) [51, 52]; and for time series, a recurrent neural network (RNN). A full-scale digital mammograph is usually a high-resolution image of about 4000 x 3000 pixels, whereas the ROI, that is, the area occupied by a probable cancerous growth, is about 100 x 100 pixels, which is a limited zone of focus for the classification of lesions. This is one of the major challenges in this investigation, especially beyond the known regions of clinical interpretation. Region-based convolutional neural networks (R-CNN) [53] and their variants [54-57] are used to build object detection and classification algorithms in this case. Efforts have also been made to train neural networks on fully annotated datasets and on larger datasets labelled only with cancer status, to make the algorithms more effective to use and apply. Pre-training on a large dataset such as ImageNet has also been observed to be a favourable approach for training a deep learning model.
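The scale mismatch between the full mammograph and the ROI can be made concrete with a short calculation. The following is a minimal sketch that assumes the 4000 x 3000 image is stored as a NumPy array with 4000 rows and 3000 columns (the orientation is an assumption) and tiles it into non-overlapping 100 x 100 patches; `roi_fraction` and `extract_patches` are hypothetical helper names, not functions from the cited works.

```python
import numpy as np


def roi_fraction(image_shape, roi_shape):
    """Fraction of the image area covered by a single ROI."""
    return (roi_shape[0] * roi_shape[1]) / (image_shape[0] * image_shape[1])


def extract_patches(image, patch_size, stride):
    """Slide a square window over a 2-D image and stack the patches."""
    height, width = image.shape
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)


# Placeholder image standing in for a digitised mammograph (all zeros).
image = np.zeros((4000, 3000), dtype=np.uint8)

fraction = roi_fraction(image.shape, (100, 100))
patches = extract_patches(image, patch_size=100, stride=100)

print(f"ROI covers {fraction:.4%} of the image")
print("patch array shape:", patches.shape)
```

Under these assumptions a single ROI occupies well under 0.1% of the image, which illustrates why patch-level or region-based models are attractive for this task.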
Shen [58] used a fully annotated dataset with ROI knowledge to pre-train a classifier model for local image patches. In this work, the weight parameters of the local patch classifier are used to initialise the weight parameters of the whole-image classifier, which are then refined further during training. The patch and image classifiers are developed using a large dataset of fully digitised mammographs without ROI information. This dataset contains thousands of images, which are further reduced to a set of small digitised film mammographs containing one-tenth of the original images. Computer-aided detection systems that do not involve deep learning, such as R2 Image Checker, Cenova 1.0, and iCAD Second Look 1.4, have proven inferior to deep learning approaches on digitised mammographic platforms [58, 59]. Turkki [59] also used a CNN to extract local image descriptors for the detection of breast cancer. Computational requirements are less stringent in this technique because the CNN is trained on ImageNet [59]. The work in [60] used supervised and unsupervised CNNs to analyse breast cancer histopathological images instead of mammographs. In this work, transfer learning techniques and deep learning tools are used to classify the histopathological images. Further clustering analysis is performed on the classified images, followed by an autoencoder network whose dimension-reduction functionality maps the extracted features to a lower-dimensional space [60]. Deep convolutional neural networks applicable to the analysis of images obtained for breast cancer detection are shown in Fig. (12). The disease data sets for analysis and experimentation are acquired from [61-65].
Wang [66] compared a deep learning strategy, the stacked denoising autoencoder (SAE), with existing machine learning benchmark classifiers such as support vector machines (SVM), linear discriminant analysis (LDA), and K-nearest neighbours (KNN) on the basis of accuracy, sensitivity, specificity, and AUC. SAE was found to be the most effective, leading to the conclusion that deep learning strategies are more advanced for classifying masses and lesions as benign or malignant. A recent work proposed the deep learning CNN Inception V3 for the detection of lymph node metastasis on US images [67]. The newly developed method of deep learning radiomics [68] detects early breast cancer using US images together with shear wave elastography (SWE) features, which measure tissue stiffness and use colour maps to show the distribution of shear wave velocity (SWV). The combination has exhibited better performance in distinguishing benign and malignant lesions [69]. A multiple-instance deep learning neural network has been used to determine oestrogen receptor status (ERS) from haematoxylin and eosin (H&E) staining, which highlights the circular morphology; this approach proved to be cost-friendly and less time consuming, and involved fewer preparation variables [70].
4. APPLICATION OF AI ENABLED TECHNOLOGICAL ADVANCES IN MOST COMMON MEDICAL IMAGING PROCESSES FOR RADIOLOGICAL DIAGNOSIS OF BREAST CANCER
The most common medical imaging processes available for breast cancer detection are mammography, ultrasound, magnetic resonance imaging, and digital breast tomosynthesis. These processes are followed by a biopsy to determine whether the tumour is cancerous, which also yields images for prediction and treatment. Given the need to enhance image analysis for better breast cancer prediction, the study in this section is organised by medical imaging process, each using advanced AI, ML, and DCNN tools to make the images more informative and to reduce the time needed to read them.
4.1. Breast Cancer Detection using AI-enabled Mammographs
Kim focused on mammography images collected from three countries (South Korea, the USA, and the UK) to develop a large-scale, variable dataset for testing an AI-based algorithm for the detection and diagnosis of breast cancer. The average age of the women in this data is 50.3 years. A total of 170,230 digital mammograms [71-74], collected between January 2000 and December 2018, were used to train the AI-based algorithm. The data set used in this work overcomes one major methodological deficiency: inadequate data to train the AI algorithm. The proposed AI algorithm is a two-stage trained algorithm based on the ResNet-34 CNN architecture. Fully supervised learning with annotated mammograms is used in Stage I (patch learning), followed by semi-supervised learning using only mammograms in Stage II (image fine-tuning). Lunit INSIGHT MMG is used as diagnostic support software for evaluating abnormality scores per breast [75]. Table 2 shows the systematic arrangement of the process of obtaining the AI-enabled digital mammography images.
4.2. Breast Cancer Detection using AI Tools Enabled MRI
Adachi [76] proposed the use of RetinaNet to train an AI system for detecting malignancy in MRI images. The AI system was evaluated and compared with human readers on the basis of sensitivity, specificity, and AUC, and it was found to perform better than the readers without AI assistance. Fig. (13) represents the images obtained from the MRI process [76].
Imaging Process/Refs. | Aim | Penetrating Waves Used | Sample Image Benign | Sample Image Malignant |
---|---|---|---|---|
Digital Mammography DM [34, 36, 37] | To locate the breast cancer using low-energy X-rays (30 kVp). | Ionising radiation | [38] | [38] |
Ultrasound US [31, 34] | To identify whether the lump is a fluid-filled mass or a solid mass. | Acoustic waves | [39] | [39] |
Magnetic Resonance Imaging MRI [40, 41] | To know the extent of an already located cancer lump. | Radio waves | [42] | [42] |
Digital Breast Tomosynthesis DBT [43] | To locate hidden lesions missed by mammographs. | Ionising radiation | [44] | [44] |
[36] Yeon (2019)
[37] Song (2021)
[38] Huang (2020)
[39] Dhabyani (2019)
[40] Petrov (2014)
[41] Mohan (2013)
[42] Zhou (2020)
[43] Teertstra (2010)
[44] Jeong (2007)
Process/Refs. | Year | Before AI | Methods | After AI | Benefits | Gaps |
---|---|---|---|---|---|---|
Digital Mammography [80] | 2020 | | ResNet-34 CNN architecture, batch instance normalisation, and deconvolutional module to overcome variance in pixel | | AI tools were successfully implemented on DM images for breast cancer detection. AI-enabled mammography reports were better in performance for breast cancer detection than the group of radiologists without AI assistance. Groups of radiologists with AI assistance were better in performance than the group of radiologists without AI assistance. | Reading time is not considered in the evaluation of performance. |
Magnetic Resonance Imaging [81] | 2020 | | Object detection using RetinaNet using AI | | AI tools were successfully implemented on MRI images for breast cancer detection. AI-enabled MRI reports were better in performance for breast cancer detection than the group of radiologists without AI assistance. The group of radiologists with AI assistance was better in performance than the group of radiologists without AI assistance. | Conversion of images into JPEG format and fat suppression cause loss of information. |
Ultrasound Imaging [83] | 2019 | | S-detect tool used with Samsung RS 80A Ultrasound system, BI-RADS 2003, 2013 CNN | | AI tools were successfully implemented on US images for breast cancer detection. The S-detect-enabled US images were better in performance for breast cancer detection than the group of radiologists without AI assistance. The group of radiologists with AI assistance was better in performance than the group of radiologists without AI assistance. | Needs to be enhanced further to predict tumour mode metastasis and to diagnose types of breast cancer. |
Digital Breast Tomosynthesis [90] | 2020 | | DBT with DCNN on BI-RADS 20 | | AI tools were successfully implemented on DBT images for breast cancer detection. AI-enabled DBT images were better in performance for breast cancer detection than the group of radiologists without AI assistance. The group of radiologists with AI assistance was better in performance than the group of radiologists without AI assistance. | There exists a difference in the reading time of the radiologists for this work and clinical practice. |
[79] Adachi (2020)
[81] Wu (2019)
[87] Sechopoulos (2020)
These MRI images are classified on the basis of the level of infection found by the radiologist at the time of imaging [77]. LabelMe, a graphic annotation tool, is used to annotate the images with rectangular boxes, and the annotations are stored in the COCO format. The AI systems proposed in the work can produce true or false reports. The aim is to make the system efficient enough to generate only true reports, with no erroneous reports among its outcomes. As both kinds of report are likely to occur in the current setting, Fig. (14) depicts the cases in which the patient obtained a negative final report following the MRI process for breast cancer diagnosis. In case (a), the report is correct, that is, the patient does not have breast cancer. In case (b), the patient does have breast cancer, but the imaging process failed to detect it and generated a negative report, which is incorrect and can mislead the patient's treatment and the diagnosis of the current state of the disease. The chance of case (b) occurring must be lowered using AI and its enabling technologies, so that a patient who receives a negative report can be sure that it corresponds to case (a).
The other case arises when the MRI report for a suspected patient is positive. Fig. (15) represents the cases in which the MRI process has generated a positive report. In cases (a) and (b), the patient is reported positive while actually suffering from the disease; in this way, the right treatment can be started at the right time and the survival rate of cancer patients can be increased.
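The rectangular LabelMe annotations stored in COCO format, mentioned earlier in this section, can be illustrated with a small conversion. The sketch assumes that a LabelMe rectangle is given by two opposite corner points and that the COCO bounding box is `[x_min, y_min, width, height]`; the function names and the example identifiers are hypothetical, and this is not the annotation pipeline of the cited work.

```python
def labelme_rectangle_to_coco_bbox(points):
    """Convert two opposite corners [[x1, y1], [x2, y2]] to [x, y, w, h]."""
    (x1, y1), (x2, y2) = points
    x_min, y_min = min(x1, x2), min(y1, y2)
    width, height = abs(x2 - x1), abs(y2 - y1)
    return [x_min, y_min, width, height]


def make_coco_annotation(annotation_id, image_id, category_id, points):
    """Build one COCO-style annotation entry for a rectangular lesion box."""
    bbox = labelme_rectangle_to_coco_bbox(points)
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,  # e.g. a hypothetical "lesion" class
        "bbox": bbox,
        "area": bbox[2] * bbox[3],
        "iscrowd": 0,
    }


# Corner order does not matter: the box is normalised to its top-left corner.
annotation = make_coco_annotation(
    annotation_id=1, image_id=10, category_id=1, points=[[120, 80], [60, 200]]
)
print(annotation)
```

Storing boxes in one consistent format makes it straightforward to feed the same annotations to different object detectors.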
4.3. Breast Cancer Detection using AI Tools Enabled Ultrasound
The S-detect AI technique is used to highlight breast lumps on ultrasound images of the breast, reducing the likelihood of false negative findings. Compared with US images that are not supported by AI algorithms, it has increased overall specificity, sensitivity, and accuracy. Fig. (16) depicts the contour marking of the margins of cancer masses in US images [78].
The same kind of AI algorithms are applicable to mammographs and tomosynthesis images. The performance metrics of specificity, sensitivity, and accuracy improve after an AI algorithm is applied to a normal tomosynthesis image for breast cancer detection, which in turn increases the probability of true positive and true negative reports. Fig. (17) shows a diagrammatic representation of AI-enabled DBT images.
AI-enabled DBT images have reduced the reading time by 19%, and the enhanced dimensional perspective makes lesions more identifiable, particularly those buried in folds, which become evident in 3D views when the lesion is bookmarked and flagged [79]. The relevant authorities must establish ethical guidelines for the deployment of AI in breast cancer detection. Because it involves the digitisation of cancer patients' data, measures for data confidentiality are required for societal benefit and balance [80]. AI algorithms have been effectively deployed on several digital screening procedures for breast cancer detection, outperforming traditional methods that lack the support of AI-enabled imaging processes. This has reduced the number of erroneous reports generated by various imaging processes, and in addition to improving the accuracy of breast cancer diagnosis, the incorporation of AI algorithms into the digital screening process has reduced reading time [67, 81]. Researchers employing AI to improve breast cancer screening found that AI reduced radiologists' workload and improved the diagnostic process, helping to avoid patient deaths [82]. XGBoost outperformed logistic regression for early cancer diagnosis; other algorithms used included Random Forest and deep neural networks [83].
5. VARIOUS STAGES INVOLVED IN RADIOLOGICAL ANALYSIS FOR BREAST CANCER
This section provides the layout of the steps involved in image analysis after the image is available for further prognosis and diagnosis. The latest AI algorithms used in different stages are also mentioned and compared.
5.1. Stage 1 (Image acquisition)
Although there are many ways of acquiring images for breast cancer detection, the four most common ways discussed so far in this paper are mammography, ultrasound, MRI, and tomosynthesis. These images are then converted into different formats for the investigation of cancer in the suspected area of the breast.
5.2. Stage 2 (Image Pre-processing)
In this stage, the image is subjected to a variety of pre-processing steps, such as changing its orientation, removing noise and artifacts, and improving its contrast and appearance, for example by reducing blur.
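Two of the pre-processing steps named above, noise removal and contrast improvement, can be sketched with NumPy alone. The 3 x 3 median filter and min-max contrast stretch below are generic textbook operations chosen for illustration, not the specific methods of any cited work; the function names and the tiny test images are assumptions.

```python
import numpy as np


def median_filter_3x3(image):
    """Suppress isolated noise by replacing each pixel with its 3x3 median."""
    padded = np.pad(image, 1, mode="edge")
    height, width = image.shape
    # Nine shifted views of the padded image cover each 3x3 neighbourhood.
    windows = np.stack(
        [padded[r:r + height, c:c + width] for r in range(3) for c in range(3)]
    )
    return np.median(windows, axis=0)


def stretch_contrast(image):
    """Rescale intensities linearly so they span the range 0.0 to 1.0."""
    values = image.astype(np.float64)
    low, high = values.min(), values.max()
    if high == low:
        return np.zeros_like(values)
    return (values - low) / (high - low)


# Uniform background with a single bright "salt" noise pixel.
noisy = np.full((5, 5), 10, dtype=np.uint8)
noisy[2, 2] = 255
denoised = median_filter_3x3(noisy)

# Low-contrast image occupying only part of the intensity range.
dull = np.array([[100, 110], [120, 150]], dtype=np.uint8)
stretched = stretch_contrast(dull)

print(denoised)
print(stretched)
```

The median filter removes the isolated bright pixel without blurring the uniform region, which is why it is a common choice for impulse noise.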
5.3. Stage 3 (Image processing)
Commonly used techniques in this step include colour processing, multi-resolution processing, compression of the image to reduce its size with minimum deterioration in quality, morphological processing to extract useful components, and description of the shapes in the image. These techniques are applied to the images captured by the sensors of the different imaging modalities for breast cancer detection.
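As one example of the techniques listed above, multi-resolution processing can be sketched as an image pyramid in which each level halves the resolution of the previous one. The 2 x 2 average pooling used here is a simple illustrative choice (an assumption), and `downsample_2x` and `build_pyramid` are hypothetical names.

```python
import numpy as np


def downsample_2x(image):
    """Halve the resolution by averaging each non-overlapping 2x2 block."""
    height, width = image.shape
    even_h, even_w = height - height % 2, width - width % 2
    cropped = image[:even_h, :even_w].astype(np.float64)
    return cropped.reshape(even_h // 2, 2, even_w // 2, 2).mean(axis=(1, 3))


def build_pyramid(image, levels):
    """Return a list of images, from full resolution down to the coarsest."""
    pyramid = [image.astype(np.float64)]
    for _ in range(levels - 1):
        pyramid.append(downsample_2x(pyramid[-1]))
    return pyramid


# Small synthetic image with values 0..15 arranged in a 4x4 grid.
image = np.arange(16).reshape(4, 4)
pyramid = build_pyramid(image, levels=3)

for level in pyramid:
    print(level.shape)
```

Coarse levels give a quick overview of large structures, while the full-resolution level retains fine detail such as small calcifications.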
5.4. Stage 4 (Segmentation)
Segmentation partitions an image into regions so that it can be studied more easily; a set of pixels forming a segment of interest is called a superpixel. It is typically applied to images to reduce their complexity for analysis. The procedure labels pixels and groups them into categories, delineating the regions with lines and curves, so that one group of pixels can be prioritised over another depending on what needs to be observed. In medical imaging, segmentation approaches are commonly used on digital mammograms, where an improved algorithm built on K-means outperformed the standard K-means algorithm [84, 85]. The combination of the K-means algorithm and the region-growing algorithm produces better results because K-means is used for segmentation and region growing is used to remove the pectoral muscles from the denser breast regions [86]. A grow-cut Gaussian mixture model (GCGMM) segmentation strategy, applied to MRI images of breast cancer lesions for mass and non-mass enhancement as well as background parenchymal enhancement, reached 95% accuracy [87]. Breast parenchymal tissue is segmented from the air and other tissues using a 3-D multiplanar multiprotocol CNN-based segmentation technique [88]. Convolutional neural networks outperformed the watershed method for segmenting skin, fibroglandular, mass, and fatty tissues on 3-dimensional ultrasound images, achieving a Jaccard similarity index (JSI) of 85.1% (Table 3).
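To illustrate the pixel-labelling idea behind K-means segmentation, here is a minimal sketch that clusters pixel intensities in one dimension and returns a label map. It is a plain textbook K-means on made-up values, not the improved or combined variants reported in the cited studies; the function name and the toy image are assumptions.

```python
def kmeans_segment(image, k=2, iterations=20):
    """Cluster pixel intensities with 1-D k-means and return a label map.

    Centroids start evenly spaced between the minimum and maximum
    intensity, so the result is deterministic. Requires k >= 2.
    Label 0 is the darkest cluster.
    """
    pixels = [p for row in image for p in row]
    lo, hi = min(pixels), max(pixels)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            groups[nearest].append(p)
        # Update step: move each centroid to the mean of its group.
        updated = [
            sum(g) / len(g) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    labels = [
        [min(range(k), key=lambda i: abs(p - centroids[i])) for p in row]
        for row in image
    ]
    return labels, centroids


# Hypothetical 4x4 image: dark background with a bright 2x2 region.
toy = [
    [10, 12, 11, 13],
    [12, 200, 210, 11],
    [10, 205, 198, 12],
    [11, 13, 10, 12],
]
labels, centroids = kmeans_segment(toy)
```

With two clusters, the bright 2x2 block receives label 1 and the background label 0. Region growing or other post-processing, as in the combined approaches described above, would operate on such a label map.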
5.5. Stage 5 (Feature Extraction)
Feature extraction reduces redundant data in the image. More data in the image means a larger number of variables and hence more computation. Feature extraction groups these variables into useful features of the image that are easier to process. Table 4 lists traditional and deep learning-based feature extraction methods; many existing studies have established that deep learning methods can extract more complex features and are more robust to variations such as scale, occlusion, and rotation [89, 90].
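The data-reduction idea can be sketched with simple first-order statistics: an image of many pixels is summarised by a handful of numbers. This is only an illustration of the principle under assumed names and made-up pixel values; it is not one of the methods listed in Table 4.

```python
from statistics import mean, pstdev


def first_order_features(image, bins=4, max_value=255):
    """Summarise an image by its mean, standard deviation, and a
    normalised intensity histogram with `bins` equal-width bins."""
    pixels = [p for row in image for p in row]
    width = (max_value + 1) / bins
    hist = [0] * bins
    for p in pixels:
        # Clamp so that max_value falls in the last bin.
        hist[min(int(p / width), bins - 1)] += 1
    n = len(pixels)
    return {
        "mean": mean(pixels),
        "std": pstdev(pixels),
        "histogram": [h / n for h in hist],
    }


# Hypothetical 2x4 image: 8 pixel values reduced to 6 feature values.
sample = [
    [0, 64, 128, 192],
    [0, 64, 128, 255],
]
features = first_order_features(sample)
```

On a real mammogram the reduction is far larger: millions of pixels become a short feature vector that a classifier can process.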
5.6. Stage 6 (Classification and Prediction)
Based on the features extracted in the preceding stages, the observed instances are categorised into classes to predict breast cancer for further prognosis. Many algorithms have been designed in the literature for this purpose using AI, ML, and deep learning techniques. Some well-known algorithms are naïve Bayes, random forest, k-nearest neighbour, support vector machine, logistic regression, artificial neural networks, and decision tree; these are mostly classical machine learning algorithms. Deep learning algorithms, such as convolutional neural networks and ResNet34, are more advanced for this purpose. The prediction is generally whether a lesion is malignant or benign. Python is commonly used to implement the above algorithms (Table 5).
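As a minimal sketch of this classification stage, the following implements k-nearest neighbour, one of the algorithms named above, on two-dimensional feature vectors. The feature values and the function name are made up for illustration and do not come from any dataset discussed in this paper.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, sample, k=3):
    """Predict the label of `sample` by majority vote among the k
    training points closest to it in Euclidean distance."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: dist(train_x[i], sample)
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors (for example, two extracted lesion features).
train_x = [
    (1.0, 1.2), (0.8, 1.0), (1.1, 0.9),   # benign examples
    (3.0, 3.2), (3.1, 2.9), (2.8, 3.0),   # malignant examples
]
train_y = ["benign"] * 3 + ["malignant"] * 3

prediction_a = knn_predict(train_x, train_y, (1.0, 1.0))
prediction_b = knn_predict(train_x, train_y, (3.0, 3.0))
```

An odd `k` avoids ties in a two-class problem; in practice the features would be scaled first so that no single feature dominates the distance.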
Type/Refs. | Segmentation Algorithm Used | Achievement |
---|---|---|
Mammograph images [88, 89, 93] | New proposed algorithm as an advancement on the K-means algorithm | Best accuracy: 89% (benign) |
Mammograph images [88, 89, 93] | Combination of the K-means algorithm for segmentation and the region-growing algorithm (KMRG) for removal of pectoral muscles | Best structural similarity index (SSIM): 92.47% |
Mammograph images [88, 89, 93] | Improved threshold-based, fully automated trained segmentation algorithm for separation of the ROI from pectoral muscles | Best accuracy: 98.13% (mini-MIAS), 100% (INbreast), and 99.8% (BCDR) |
DCE MRI images [90, 91] | Improved segmentation with volumetric delineations, such as the grow-cut Gaussian mixture model (GCGMM) approach, for mass and non-mass enhancement and background parenchymal enhancement | Best accuracy: 95%. It is a volumetric segmentation approach and is recognised as the most reproducible segmentation |
DCE MRI images [90, 91] | Multiplanar 3-dimensional breast segmentation using a U-Net CNN deep neural network training algorithm for segmentation of breast parenchyma from air and other tissues | Median Dice similarity index of 96.60% (±0.30%) and 100% neoplastic lesion coverage |
US images [92] | CNN-based automated segmentation method on 3-dimensional ultrasound images for skin, fibroglandular, mass, and fatty tissues | Jaccard similarity index (JSI) of 85.1%, which outperformed the watershed algorithm (75.54%) |
[89] Khoulqi (2018)
[90] Veeraraghavan (2018)
[91] Piantadosi (2019)
[92] Xu (2018)
[93] Zebari (2020)
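The Dice similarity index and the Jaccard similarity index (JSI) reported in the segmentation table measure the overlap between a predicted mask A and a reference mask B: Dice = 2|A∩B| / (|A| + |B|) and JSI = |A∩B| / |A∪B|. A minimal sketch, using hypothetical 3x3 binary masks and an assumed function name:

```python
def overlap_scores(predicted, reference):
    """Return (dice, jaccard) for two binary masks of equal shape."""
    pred = [bool(p) for row in predicted for p in row]
    ref = [bool(r) for row in reference for r in row]
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p and r for p, r in zip(pred, ref))
    union = sum(p or r for p, r in zip(pred, ref))
    total = sum(pred) + sum(ref)
    if total == 0:  # both masks empty: treat as perfect agreement
        return 1.0, 1.0
    return 2 * intersection / total, intersection / union


# Hypothetical masks: 4 foreground pixels each, 3 of them shared.
predicted_mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
reference_mask = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
]
dice, jaccard = overlap_scores(predicted_mask, reference_mask)
```

For these masks the Dice index is 0.75 and the JSI is 0.6. The Dice index is never lower than the JSI for the same pair of masks, which is worth keeping in mind when comparing results reported with different metrics.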
Feature Extraction Methods for Breast Cancer Imaging Analysis
Traditional methods (before involving deep learning convolutional neural networks for feature extraction)
Name of the Method | Function Used | Principle Involved | Limitations | Feature Extracted |
---|---|---|---|---|
Harris Corner Detection | Gaussian window function | Difference in intensity for a displacement in all directions | Rotation invariant but not scale invariant | Corner detection |
Shi-Tomasi Corner Detector | Modified scoring function of the Harris corner detection method | If the score is greater than a threshold value, the point is considered a corner | Rotation invariant but not scale invariant | Better corner detection |
Scale-Invariant Feature Transform (SIFT) | Scale-space extrema detection, keypoint localisation, orientation assignment, keypoint descriptor, keypoint matching | Keypoint matching between nearest neighbours | Slow | Scale invariant |
Speeded-Up Robust Features (SURF) | Box-filter approximation of the Laplacian of Gaussian for finding scale-space | Three times faster than SIFT; compares the contrast of features | Not good at handling viewpoint change and illumination change | Faster matching without reducing the descriptor's performance; the feature descriptor has an extended 128-dimension version |
Features from Accelerated Segment Test (FAST) | Non-maximum suppression | Stores the 16 pixels on a circle around the feature point as a vector | Not robust to high levels of noise | Corner detection at much higher speed |
Binary Robust Independent Elementary Features (BRIEF) | Hamming distance to match descriptors | Reduces memory usage by converting floating-point descriptors to binary strings | Poor recognition under large in-plane rotation | Descriptor only; it cannot detect features by itself and is used with detectors such as SIFT and SURF |
Deep learning methods (after involving deep learning convolutional neural networks for feature extraction)
Name of the Method | Function Used | Principle Involved | Limitations | Feature Extracted |
---|---|---|---|---|
SuperPoint: Self-Supervised Interest Point Detection and Description | VGG-style encoder for feature extraction, followed by two decoders, one for point detection and the other for point description | CNN computes SIFT-like output in a single forward pass | Needs to be enhanced for semantic segmentation | Point detection and point description |
D2-Net: A Trainable CNN for Joint Description and Detection of Local Features | VGG-16 architecture pretrained on ImageNet | CNN working on the detect-and-describe principle, applied to 3D reconstruction and visual localisation; for image matching, its baseline is RootSIFT | Needs to be enhanced for accuracy of keypoints | Joint description and detection of local features |
LF-Net: Learning Local Features from Images | Feature map generation using ResNet; scale-invariant keypoint detection; orientation estimation; descriptor extraction | LF-Net is a dense, multi-scale CNN that returns keypoint locations, scales, and orientations | Needs to be improved for larger frame differences | LF-Net outperforms SURF by 39% |
Deep Graphical Feature Learning for the Feature Matching Problem | Graph neural network converts weak local geometric features to rich local features for efficient feature matching | Compositional message-passing neural networks are used, and a CNN is viewed as a graph neural network on a grid graph where each grid cell is a feature vector | Not mentioned | Local features obtained by the graph neural network, when used with different algorithms, can outperform traditional feature extraction methods |
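The table notes that BRIEF stores descriptors as binary strings and matches them with the Hamming distance. A minimal sketch of that matching step follows, with 8-bit descriptors stored as Python integers. The descriptor values, the function names, and the `max_distance` threshold are illustrative assumptions; real BRIEF descriptors are typically 128 to 512 bits long.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptors(query, train, max_distance=2):
    """Brute-force matching: for each query descriptor, return the index
    of the closest train descriptor, or None when even the closest one
    differs in more than `max_distance` bits."""
    matches = []
    for q in query:
        best = min(range(len(train)), key=lambda j: hamming(q, train[j]))
        matches.append(best if hamming(q, train[best]) <= max_distance else None)
    return matches


# Hypothetical 8-bit descriptors from two images.
query = [0b10110010, 0b00001111, 0b11111111]
train = [0b00001110, 0b10110011, 0b01010101]
matches = match_descriptors(query, train)
```

Here the first two query descriptors each find a train descriptor one bit away, while the third has no candidate within the threshold and is left unmatched. Because the distance is an XOR followed by a bit count, this comparison is much cheaper than the Euclidean distance used for floating-point descriptors.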
Ref. | Year | Countries | Imaging Modality | AI Algorithm | Achievement |
---|---|---|---|---|---|
[82] | 2020 | USA | Digital Mammograms | Cancer-detector triage algorithm | - |
[83] | 2020 | China | Digital Screening | Extreme gradient Boosting algorithm | Better performance over logistic regression, random forest, deep neural network algorithm |
[93] | 2020 | China | Digital screening | Efficient Ada boost algorithm | Accuracy -97.2% Sensitivity -98.3% Specificity -96.5% |
[94] | 2020 | Egypt | Digital Mammograms | Chaotic salp swarm algorithm | Segmentation algorithm extracted the ROI from the entire image |
[95] | 2020 | India | Digital mammograms | Firefly Algorithm | LSNR superseded FA in accuracy as a classifier by 7% |
[96] | 2021 | China, data set from UCI | Digital screening | Surrogate assisted Firefly algorithm | Reduced computation cost |
[97] | 2021 | Not mentioned | Digital mammographs | Particle swarm optimisation (PSO) using naïve bayes classifier | PSO got the highest accuracy of 98% over BAT algorithm, Ant Colony |
[98] | 2020 | India; BCDR-FC03 and local mammographic data | Digital mammographs | Harmony search and simulated annealing (HS-SA) algorithm | Accuracy for the local mammographic data set = 99.89% and for BCDR F03 = 99.76% |
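The accuracy, sensitivity, and specificity figures reported for these algorithms all derive from the confusion matrix of a binary classifier. A minimal sketch of the computation, using made-up labels and an assumed function name:

```python
def screening_metrics(y_true, y_pred, positive="malignant"):
    """Accuracy, sensitivity (true-positive rate), and specificity
    (true-negative rate) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Undefined when a class is absent from the data.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical ground truth and predictions for 10 cases.
M, B = "malignant", "benign"
y_true = [M, M, M, M, B, B, B, B, B, B]
y_pred = [M, M, M, B, B, B, B, B, B, M]
metrics = screening_metrics(y_true, y_pred)
```

In this toy case one malignant lesion is missed (a false negative) and one benign lesion is flagged (a false positive), giving an accuracy of 0.8, a sensitivity of 0.75, and a specificity of 5/6. Sensitivity matters most in screening, because a false negative delays treatment.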
Ref. | Name of Data Set | Imaging Processes | Key Remarks |
---|---|---|---|
[99] | Collected under HIPAA | FFDM = 245, US = 1125, DCE-MRI = 690 | Smaller data set; lacks parameter optimisation |
[39] | Collected from Baheya hospital, stored in DICOM format | US = 1100 | After pre-processing, 780 images |
[100] | BreCaHAD; microscopic biopsy images stored in .TIFF format | Histopathological images = 162 | Limited pixel tonal value of images |
[101] | Wisconsin database (WDBC) | FNA tests of breast mass | Accuracy 99.04% |
[102] | CAMELYON16; collected from RUMC Netherlands | Histopathological images = 1322 annotated images | Generalised and robust dataset of annotated images |
[103] | Breast Cancer Coimbra dataset (BCCD) | Biomarker = 116 instances | SVM algorithm; sensitivity = 88% |
[104] | CBIS-DDSM | Digital mammographs | Accuracy 71.01% |
[105] | MIAS | Digital mammographs = 322 | Normal = 133, abnormal = 189; segmentation needs to be improved for greater accuracy |
[106] | WDBC | Instances = 569 | Accuracy 96.5% |
6. CO-EVOLUTIONARY COMPUTATION-BASED AI ALGORITHMS USED IN RECENT WORKS FOR RADIOLOGICAL ANALYSIS OF BREAST CANCER
6.1. Data Sets used to Train Algorithms for Early Breast Cancer Detection
Any neural network or artificial intelligence system needs an effective data set for its training. Often, the features learned by an artificial intelligence system are incomplete because the data set on which it is trained is inadequate. The data set therefore plays a vital role in the accuracy of any artificial intelligence system. Table 6 provides information on some datasets available for breast cancer detection using imaging methods enabled by deep learning convolutional neural networks. The dataset collected under HIPAA is a smaller dataset and lacks parameter optimisation [99]. The BreCaHAD dataset is a collection of 162 histopathological images with limited pixel tonal value [100], and CAMELYON16, collected from RUMC Netherlands, is a bigger dataset of 1322 annotated histopathological images [101]. Some more datasets are also listed in Table 6 with their key remarks.
CONCLUSION
In this work, a detailed investigation of breast cancer detection is conducted, and the findings drawn from it should help researchers advance the field for the benefit of patients with confirmed or suspected breast cancer. Breast cancer has the second highest mortality rate, and its treatment sometimes involves removing a woman's breasts, leaving her with a decreased feeling of femininity. The patient has a chance of survival only if the disease is diagnosed in its early stages. Correct diagnosis of a breast lump (tumour) prevents false positives and false negatives. According to the studies cited in this paper, mammograms are commonly employed as the initial step in diagnosing breast cancer and localising any suspicious lesions. MRI is performed on high-risk patients or on individuals with denser breasts, as a needle-guidance aid, and for chemotherapy patients to determine the degree of recovery from previous therapy. Ultrasound is used to determine whether a suspected tumour is fluid-filled (less likely to be cancerous) or solid (more likely to be cancerous). Digital breast tomosynthesis provides sectional images of hidden layers in the breast, potentially identifying tumours missed by mammograms. For patients undergoing biopsy, a tissue sample is removed from the breast, and the resulting histopathology images are subjected to image analysis to determine whether the suspected tissue growth is malignant. Medical imaging systems equipped with sophisticated AI capabilities, such as deep learning convolutional neural networks, have surpassed traditional imaging methods across all imaging modalities and stages. These approaches can be enhanced in the future for better performance and faster outcomes [107-110].
CONSENT FOR PUBLICATION
Not applicable.
AVAILABILITY OF DATA AND MATERIALS
The data supporting the findings of the article are available in the National Library of Medicine at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6007545/, reference number [102].
FUNDING
None.
CONFLICT OF INTEREST
Dr. Ayush Dogra is the editorial advisory board member of the journal The Open Neuroimaging Journal.
ACKNOWLEDGEMENTS
Declared None.