What is the PCGFractal Classification Dataset?

The PCGFractal dataset is an image database built with the Partitioned Iterated Function Systems (PIFS) method, which provides a very compact representation of the PCG signal and captures its salient features by exploiting self-similarities.


It was built from the 1D signals contained in the PhysioNet Computing in Cardiology Challenge 2016 (2016 PhysioNet/CinC) dataset.

Background Information

Phonocardiography (PCG) is the graphic display of the sound waves produced by the heart.
The graphic representation of the sound characteristics makes it possible to visualize the temporal relationships, precise duration, intensity and contours of the waves. During auscultation of the cardiac cycle, this acquisition technique allows medical staff to record and analyze the audible sounds and murmurs produced by the movement of the heart's structures and by turbulence in the blood flow.
While the heart is working, two major tones, S1 and S2, are generated by the vibration of the cardiovascular system.
These tones, which vary in intensity and duration, are audible during the cardiac cycle. Between S1 and S2, a systolic sound is generated mainly by the closure of the atrioventricular valves. Conversely, between S2 and S1, a diastolic sound is caused by the filling of the ventricles with blood and by their relaxation, as shown in the figure below:

PCG_example

Over the years, many researchers have worked on the automatic classification of pathological and healthy heart sounds, but distinguishing between the classes of interest is non-trivial. The data are easily affected by environmental noise, and heart sounds associated with different cardiac conditions can be very hard to tell apart. More robust methods are therefore still needed for the early diagnosis of cardiac abnormalities.

For this reason, this dataset aims to support researchers in developing new algorithms for machine learning, signal processing, image analysis, classification and other tasks related to the PCG signal.

Database Description

To build this dataset, a transcoding process was applied to transform each 1D input signal into a 2D color image in two steps. The first step, encoding, extracts a code from the 1D input signal; the second, decoding, maps that code into a 2D color image.

workflow
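
To give a feel for what the encoding step may involve, the following is a minimal sketch of generic PIFS-style coding applied to a 1D signal. It is not the procedure used to build this dataset: the block sizes, the exhaustive domain search, the clamping of the scale factor and all function names are assumptions made for illustration, and the decoding of the code into a 2D color image is not covered. The paper cited below describes the actual method.

```python
# Hypothetical sketch of PIFS-style encoding for a 1D signal.
# Block sizes, search strategy and all names are illustrative assumptions,
# not the procedure used to build the PCGFractal dataset.

def downsample(block):
    """Halve a domain block by averaging adjacent sample pairs."""
    return [(block[i] + block[i + 1]) / 2.0 for i in range(0, len(block) - 1, 2)]


def fit_affine(domain, target, max_scale=0.9):
    """Least-squares scale s and offset o so that s * domain + o ~ target.

    The scale is clamped to [-max_scale, max_scale] to keep the map contractive.
    """
    n = len(domain)
    mean_d = sum(domain) / n
    mean_t = sum(target) / n
    var_d = sum((d - mean_d) ** 2 for d in domain)
    if var_d == 0.0:
        s = 0.0  # flat domain block: only the offset can be fitted
    else:
        cov = sum((d - mean_d) * (t - mean_t) for d, t in zip(domain, target))
        s = max(-max_scale, min(max_scale, cov / var_d))
    o = mean_t - s * mean_d
    err = sum((s * d + o - t) ** 2 for d, t in zip(domain, target))
    return s, o, err


def pifs_encode(signal, range_size=4):
    """Return one (domain_start, scale, offset) triple per range block."""
    domain_size = 2 * range_size
    code = []
    for r_start in range(0, len(signal) - range_size + 1, range_size):
        target = signal[r_start:r_start + range_size]
        best = None
        # Exhaustive search over domain blocks twice as long as the range block.
        for d_start in range(0, len(signal) - domain_size + 1, range_size):
            domain = downsample(signal[d_start:d_start + domain_size])
            s, o, err = fit_affine(domain, target)
            if best is None or err < best[3]:
                best = (d_start, s, o, err)
        code.append(best[:3])
    return code


# Toy signal (made-up values), split into range blocks of 4 samples.
signal = [0.0, 0.2, 0.9, 0.3, -0.1, -0.4, 0.0, 0.1,
          0.0, 0.3, 1.0, 0.2, -0.2, -0.3, 0.1, 0.0]
code = pifs_encode(signal)
for domain_start, scale, offset in code:
    print(domain_start, round(scale, 3), round(offset, 3))
```

Each range block is described only by the position of its best-matching domain block and two affine parameters, which is why a representation of this kind can be very compact.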

All 1D signals have been taken from the PhysioNet Computing in Cardiology Challenge 2016 (2016 PhysioNet/CinC) dataset.

Details of the transcoding process are given in the paper:
Riccio, D., Brancati, N., Sannino, G., Verde, L., & Frucci, M. (2023). CNN-based classification of phonocardiograms using fractal techniques. Biomedical Signal Processing and Control, 86, 105186.

This dataset is useful for analyzing PCG signals, for example in classification tasks.

Records information

The PhysioNet Challenge training set consists of five databases (A through E) containing a total of 3,126 heart sound recordings, lasting from 5 seconds to just over 120 seconds.
By applying the transcoding process, we built several datasets. With reference to our paper, we share here the datasets of Exp. 1 and Exp. 5, which are composed as follows:

Exp. 1

Augmentation  Balancing  # training            # validation
                         Abnormal   Normal     Abnormal   Normal
NO            NO         480        2345       151        150
Total                    2825                  301

Exp. 5

Augmentation  Balancing  # training            # validation
                         Abnormal   Normal     Abnormal   Normal
YES           YES        5080       5250       151        150
Total                    10330                 301
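
The Exp. 1 training split is strongly skewed toward normal recordings, while Exp. 5 is close to balanced. The short sketch below checks the totals reported above and computes inverse-frequency class weights, a common option when training on an unbalanced split. The counts are copied from the tables; the dictionary layout, the function names and the choice of weighting scheme are assumptions for illustration, not part of the dataset or of the paper's training setup.

```python
# Sample counts copied from the tables above. The dictionary layout and the
# inverse-frequency weighting are illustrative assumptions.
counts = {
    "Exp.1": {
        "training": {"abnormal": 480, "normal": 2345},
        "validation": {"abnormal": 151, "normal": 150},
    },
    "Exp.5": {
        "training": {"abnormal": 5080, "normal": 5250},
        "validation": {"abnormal": 151, "normal": 150},
    },
}


def split_total(exp, split):
    """Total number of images in one split of one experiment."""
    return sum(counts[exp][split].values())


def class_weights(exp, split="training"):
    """Inverse-frequency weights: total / (n_classes * class_count)."""
    per_class = counts[exp][split]
    total = sum(per_class.values())
    return {label: total / (len(per_class) * n) for label, n in per_class.items()}


for exp in counts:
    train = counts[exp]["training"]
    ratio = train["normal"] / train["abnormal"]
    print(exp, "training total:", split_total(exp, "training"),
          "normal/abnormal ratio:", round(ratio, 2))
    print(exp, "class weights:", class_weights(exp))
```

With weights of this kind, errors on the rarer abnormal class count more during training; for Exp. 5 both weights are close to 1, so weighting has little effect there.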

Any research or publication based on this database should cite:
Riccio, D., Brancati, N., Sannino, G., Verde, L., & Frucci, M. (2023). CNN-based classification of phonocardiograms using fractal techniques. Biomedical Signal Processing and Control, 86, 105186.

The Research Team

The team that created this dataset is composed of researchers from the "Artificial Intelligence in Image and Signal Analysis – AI-ISA" group of the Institute for High Performance Computing and Networking (ICAR) of the National Research Council (CNR) of Italy, from the University of Naples "Federico II", and from the University of Campania "L. Vanvitelli".

Learn more about us by visiting our webpages!

Ready to use the PCGFractal dataset?

If you have any questions, please do not hesitate to contact us!