This study was approved by the Ethics Committee of Pirkanmaa Hospital District regarding culturing and differentiating of human iPSC lines (R08070). All experimental methods related to hiPSC-CMs have been described earlier.14 Briefly, studied iPSC cell lines included six CPVT lines generated from CPVT patients carrying RyR2 mutations including exon 3 deletion, and point mutations P2328S, T2538R, L4115F, Q4201R and V4653F. The iPSCs were differentiated into spontaneously beating CMs using the END2 differentiation method13 and dissociated to single-cell level for calcium imaging studies, which were conducted with spontaneously beating Fura-2 AM (Invitrogen, Molecular Probes) loaded CMs. Calcium transient signals were measured with inverted IX70 microscope with a UApo/340 x20 air objective (Olympus Corporation, Hamburg, Germany) with an ANDOR iXon 885 CCD camera (Andor Technology, Belfast, Northern Ireland) and a Polychrome V light source by a real time DSP control unit and TILLvisION or Live Acquisition (TILL Photonics, Munich, Germany) softwares. Calcium signals were acquired as the ratio of the emissions at 340/380 nm wavelengths, and background noise was subtracted before further processing. For drug studies, the changes in calcium were recorded during spontaneous baseline beating, spontaneous beating after exposure to 1 µM adrenaline and spontaneous beating after exposure to 1 µM adrenaline together with 10 µM dantrolene (Sigma). If calcium transient abnormalities were detected after exposure to adrenaline, cells were exposed to dantrolene to see the potential antiarrhythmic response.
Data Computed from Calcium Transient SignalsWe have designed and implemented a computational method to recognize beats or peaks of calcium transient signals originating from iPSC-derived cardiomyocytes.7,8,–9 The recognition of peaks was based on the computation of first derivative using successive short segments of a few samples from the beginning of a transient signal to its end. When first derivative values increased from roughly zero values rapidly to positive values the beginning of a peak was met, then decreasing first derivative values back close to zero its maximum was found and finally after rapid change to negative values again close to zero the end of the peak was observed. Very small peaks containing smaller amplitudes approximately less than 8% compared with those of the large amplitude peaks in the signal were left out as potential noise.
A biotechnology expert determined whether an iPSC-derived cardiomyocyte had generated normally or abnormally beating cycles. This was also mainly applied in the present study, since this is our first case to apply machine learning methods to drug research with calcium transient signals and we wanted to be as sure as possible with regard to decisions to which type each transient signal should be labelled being central for creating highly qualified training sets for machine learning tests.
We used the data computed from six CPVT cell lines altogether containing 128 calcium transient signals for each of the baseline, adrenaline and dantrolene conditions. The biotechnology expert labelled the signals affected by dantrolene to three classes called responder, semi-responder and non-responder. In a responder signal dantrolene abolished all the calcium cycling abnormalities, which therefore included only normal calcium peaks. In a semi-responder signal dantrolene reduced abnormalities by more than 50% causing the signal to comprise of some abnormally shaped calcium peaks. In a non-responder signal dantrolene reduced abnormalities by less than 50% causing the signal to consist of clearly abnormally shaped calcium peaks. Figures 1 and 2 show example transient signal segments. Table 1 presents their numbers for six CPVT cell lines. In Table 1, cardiomyocytes were first treated with adrenaline and then the effect of dantrolene was analyzed. Only those cardiomyocytes with adrenaline-induced arrhythmias were studied. Cell lines 1, 2 and 4 were mainly responders and most arrhythmias were abolished with dantrolene. Cell lines 5 and 6 were mostly non-responders and dantrolene did not abolish the arrhythmias. Cell line 3 had equal amount of responders, semi-responders and non-responders.
Figure 1A baseline calcium transient signal segment of around 10 s measured from an iPSC-derived cardiomyocyte in association with CPVT disease. The peaks were detected by the signal recognition algorithm evaluating all of them to be rather normally shaped calcium peaks close the similar size.
Figure 2(a) An adrenaline exposure signal segment and (b) its dantrolene responder signal including regular peaks of a roughly similar size, (c) an adrenaline exposure signal segment and (d) its semi-responder containing slight irregularity, and (e) an adrenaline exposure signal segment and (f) its non-responder containing rather similar irregularity as before dantrolene.
Table 1 Numbers of responder, semi-responder and non-responder signals after affecting by dantrolene when 15, 30, 17, 31, 22 and 13 transient signals were measured from six CPVT cell lines, and numbers of recognized peaks. Data Computed from Peaks of Calcium Transient SignalsIn our data there were baseline signals, adrenaline signals and dantrolene (responder, semi-responder or non-responder) signals, 128 of them in each of three sets. In order to enable an analysis subject to their relations, we first computed variable values from all their valid peaks recognized in the preceding phase. Their computation had been presented in detail previously.7,8,–9 They are illustrated in Fig. 3. We applied 12 different peak variables9: amplitudes Al and Ar of peak left and right sides, their durations Dl and Dr, their maximum max(s’) (from peak left side) and absolute minimum |min(s’)| (from peak right side) for the first derivative s’, maximum max(s’’) and absolute minimum |min(s’’)| of the second derivative s’’ from the peak right side, peak surface R area between the peak curve and the line between the peak beginning and end, duration Δ from the peak maximum back to that of the preceding peak or, if this non-existent, back to the beginning of the signal, duration dl from the peak beginning to the location of the first derivative maximum (inside the left peak side), and duration dr from the location of the peak maximum to the location of the first derivative absolute minimum (inside the peak right side).
Figure 3Peak amplitudes Al and Ar, durations Dl and Dr, approximate location L1 for the computation of max(s’) (first derivative), approximate location L2 for |min(s′)|, approximate location L3 for |min(s′′)| (second derivative) and approximate location L4 for max(s′′), surface area R, duration Δ from the preceding peak, duration dl from the peak beginning to L1, and duration dr from the peak maximum to L2.
In Table 2 the means of the results differ in most cases if we compare variable by variable between all possible pairs of baseline, adrenaline and dantrolene response, semi-response and non-response. Nevertheless, there are also some such pairs in which differences are small. Frequently standard deviations are relatively great compared to the means in the same cells. All 12 variables are used jointly in actual machine learning analysis Thus, the differences of the means and standard deviations of the single variables do not predict inevitably how effectively four classes of adrenaline vs. dantrolene response, semi-response and non-response could be separated from each other. The machine learning analysis is considered in the following section.
Table 2 Means and standard deviations of peak variables for baseline, adrenaline and dantrolene signals (responder, semi-responder and non-responder): amplitudes Al and Ar, durations Dl and Dr, maximum and absolute minimum of s′, maximum and absolute minimum of s′′, peak area R, time difference, and durations dl and dr. Note that before the peak recognition the amplitude values in all signals were multiplied by 1000 compared with those in Figs. 1 and 2. Classification Methods Applied to Separation of Signal ClassesGenerally speaking, in machine learning there are three issues that often are the most critical ones related to the success of applying machine learning methods to real world problems. These issues are selecting the right variables, constructing training and test sets properly for the machine learning methods and finding the right hyperparameter values for the algorithms. When these issues are solved with respect to the problem handled, in many cases the results are good or satisfactory at least. For the separation of the transient signal classes we used all the 12 variables consistently with all classification algorithms.
The classification or separation among five different calcium signal classes (baseline, adrenaline, responder, semi-responder and non-responder) is based on several machine learning algorithms. Testing of several classification algorithms is necessary in practice in order to obtain as wide empirical evidence as possible which algorithm would be the most suitable for the application considered. However, we need to remember that the application examined in this paper includes several research lines, which are covered in the following section in a more detailed way. Since the machine learning research in this paper is application oriented by nature, each one of the research lines requires a separate and detailed analysis and the analysis consists of in this case the selection of the most suitable classification algorithm and hyperparameter values, if necessary. Because all research lines have their own special characteristics such as different dataset and/or class distribution between each other, we cannot guarantee that there is only one classification algorithm, which would outperform other methods tested in all possible research lines. Machine learning is in practice tailoring algorithms to work in a specific domain. The best results presented in the next section are directional for researchers and/or practitioners who work with the same kind of research problem as described in this paper. The results give perspective which methods would be the best ones for the similar research problems what one is examining. Fine-tuning of an algorithm like selecting the optimal hyperparameter values are always data and problem dependent so there is not any clear guidelines how to select the values optimally.
The methods tested are the same as in our earlier studies.8,9K-nearest neighbor nearest searching (KNN) was applied with Chebychev metric, with cityblock (Manhattan) metric, with correlation measure, with cosine measure, with Euclidean metric, with Mahalanobis measure, with standardized Euclidean metric and with Spearman measure. All of them were tested with equal, inverse or squared inverse weighting, naturally, and all the measure and weighting combinations were tested with odd K values from 1 to 37 (number of nearest neighbors searched). The selection of odd K values is justified on the basis of tie exclusion that may happen if even K value is used.
Besides KNN algorithm, linear, quadratic and Mahalanobis discriminant analysis were examined as well as decision trees (CART), multinomial logistic regression (logistic regression in binary case), Naïve Bayes with normal distribution, Naïve Bayes with kernel density estimation and with the normal, box, Epanechnikov and triangle kernels were examined. Random forests were also investigated and the number of trees tested in a forest ranged from 1 to 100 with step size of 1. When a random forest has only one tree in a forest, the tree structure differs from the tree structure given by the CART algorithm. Furthermore, least squares support vector machines (LS-SVMs) were used with the linear, quadratic, cubic, and radial basis function (RBF) kernels. Test set-ups included both binary and multi-class classification schemes and, hence, with LS-SVMs also a multi-class extension of LS-SVM was required. In this study a hierarchical approach of LS-SVM was used that was similar to a method applied in Ref. 6 More specifically, Fig. 4 represents the hierarchical LS-SVM used in this study. With LS-SVMs boxconstraint (C) and σ parameter (encountered in RBF kernel) had the same parameter value space of {2−12, 2−11,…, 217}.
In other words, the polynomial kernels were tested with 30 different values of C and RBF kernel with 900 (C,σ) combinations.
Figure 4Hierarchical LS-SVM structure used to separate non-responder (NR), responder (R) and semi-responder (SR). Each inner node in the structure consists of a binary LS-SVM classifier.
Now the variables and tested hyperparameter values are explained from the three essential issues with respect to machine learning algorithms. In this paper we used leave-one-out (LOO) method in a signal level for classification. Here, in each LOO round the data from one signal is left for test set and the rest of the data forms a training set. We must also notice that the data of one signal is a collection of data derived from the peaks within the signal. Hence, every row in a test set corresponds to a data gained from one peak within a signal. Since we are dealing with signal classification, we need to separate signal level and peak level information from each other. A classification algorithm forms a model based on peak level information and gives prediction for each row (peak) in a test set. However, the transformation from peak level prediction into signal level prediction needs to be made in order to achieve the signal level classification result. This transformation is made based on majority voting method. In other words, we take the mode of test set predictions (test set consists of only data from one signal) to gain the signal level classification. Since mode can be unambiguously determined (for example, test set includes 10 rows of data and 5 rows obtains class A as predicted class label and the rest 5 rows obtains class B as a predicted class label), a strategy for ties must be developed. In this paper we used the same strategy as in Refs. 7 and 9 to solve the ties, so a reader can find the detailed description about the tie solving strategy from the given references. After LOO procedure, we have obtained a signal level prediction for each signal in a dataset. Hence, we can compare the predicted class labels to ground truth class labels and define a confusion matrix. From the confusion matrix we evaluate accuracy (trace of confusion matrix divided by the sum of elements in a confusion matrix) and other evaluation measures (true positive, TP). When a classification algorithm required parameter tuning, we performed LOO procedure with all parameter values tested and selected parameter value that gained the highest accuracy. The reason behind the use of LOO is the lack of data. Machine learning algorithms usually require a lot of data to work well and to produce a reliable model for prediction. With the use of LOO we maximized the size of training data. Before any classification, we applied z-score standardization to the whole dataset in order to obtain all variables equally important.
Classification ResultsIn the following, only the best classification method is mentioned for every test set-up. The performance was evaluated by applying true positive, false positive and negative in confusion matrices, and classification accuracy as usual. First, we classified among three dantrolene transient signal classes to study how well these can be separated from each other. Table 3 where correctly classified are presented along the diagonal presents the best results generated by random forests with 14 trees and giving classification accuracy of 65.6% and sensitivities (true positive rates) of 79.7% for responders, 35.7% for semi-responders and 65.9% for non-responders.
Table 3 Classification of three dantrolene classes: responders (R), semi-responders (SR) and non-responders (NR).According to Table 3 semi-responders were classified worse than the other two. The results in Table 3 denote that the two classes of responders and semi-responders may resemble somewhat each other, because more semi-responders (12 signals) were incorrectly classified into the class of responders than correctly to semi-responders (10 signals). Furthermore, on the basis of Table 1 the number 28 of semi-responder signals being the minority class here is less than 59 and 41 of responders and non-responders. The poor results of semi-responders may partly be caused by their characteristics of being between the responders and non-responders as Fig. 2 showed.
We united the responders and semi-responders (RSR) and then computed results given in Table 4. Accuracy is now 78.9%, when sensitivities are 90.8% for RSR and 53.7% for NR produced by random forests with 36 trees. Here the quite poor sensitivity of the non-responders might be inflicted by their minority in the data, 32% of all.
Table 4 Classification of dantrolene classes: responders merged with semi-responders (RSR), and non-responders (NR).Next we studied classification when the semi-responder and non-responder signals are merged. In Table 5 their results are shown producing the accuracy of 73.4% with sensitivities of 69.5% for responders (R) and 76.8% merged semi-responders and non-responders (SNR) by K-nearest neighbor searching algorithm with city block metric and equal, inverse or squared inverse weighting, K equal to 1 for all these three.
Table 5 Classification of dantrolene classes: responders (R), and semi-responders merged with non-responders (SNR).Because the accuracy of the test se-up for Table 4 is higher than that for Table 5, in other words, there are less incorrect predictions (19 + 8) in Table 4 than those (16 + 18) in Table 5, we continued to apply the fusion of responders and semi-responders. Note that the number of these signals was rather limited, thus, not the very best starting point for machine learning tasks.
We continued by classifying adrenaline vs. merged responders and semi-responders (RSR). Their results in Table 6 produced the accuracy of 71.2% and sensitivities of 70.3% for adrenaline (A) and 72.4% for RSR computed with least-squares support vector machines with the radial basis kernel with parameters C = 210 and σ = 22.
Table 6 Classification of adrenaline vs. dantrolene signal classes or responders merged with semi-responders (RSR).Next we computed adrenaline against non-responders shown in Table 7. This achieved the accuracy of 78.1% and sensitivities of 90.6% for adrenaline (A) and only 39.0% for non-responders (NR) given by Naïve Bayes with kernel density estimation and with the triangle kernel. The non-responders might suffer from the minority of 24% only when the majority of the adrenaline class was very predominant in classification. Nevertheless, the main reason of the low sensitivity of NR is that the non-responders resemble more or less the adrenaline signals. This is quite natural, because then dantrolene had not influence, in other words, it did not correct peak shapes in these NR signals.
Table 7 Classification of adrenaline vs. dantrolene class non-responders (NR).We also computed others such as merging semi-responders and non-responders and tested with that and also with three separate dantrolene classes against adrenaline, but these gave somewhat poorer results than the presented above.
We still compared the situation between baseline and adrenaline transient signals. Results are shown in Table 8 and were computed with least squares support vector machines with the radial basis kernel with parameters C = 2−10 and σ = 22. Accuracy is 54.7% and sensitivities are 57.0% for adrenaline signals (A) and 52.3% for baseline signals (B). These are rather low close to 50% indicating that A and B do not differ much from each other. This is in line with the result in our previous publication where we showed that CPVT-CMs demonstrated marked amount of calcium transient abnormalities both in baseline and in response to adrenaline.14
Table 8 Classification of baseline (B) vs. adrenaline (A)RetroSearch is an open source project built by @garambo | Open a GitHub Issue
Search and Browse the WWW like it's 1997 | Search results from DuckDuckGo
HTML:
3.2
| Encoding:
UTF-8
| Version:
0.7.4