An independent and comprehensive evaluation of the CAZyme classifiers:

  • dbCAN2
  • dbCAN3
  • dbCAN4
  • eCAMI
  • CUPP
  • HMMER (dbCAN4)
  • dbCAN-sub (dbCAN4)
  • DIAMOND (dbCAN4)

Evaluating performance of:

  • Binary CAZyme/non-CAZyme classification
  • Test set dependent performance of CAZyme/non-CAZyme classification
  • Binary classification per CAZy class
  • Multilabel CAZy class classification
  • Binary classification per CAZy family
  • Multilabel CAZy family classification

1 CAZyme classifier references and names

The CAZyme classifier dbCAN is available as a webserver and as a standalone tool. This evaluation used the standalone tool, which is referred to as dbCAN; the webserver is always referred to explicitly as the dbCAN webserver. The version numbers of the standalone tool and the webserver are independent of one another:

  • The dbCAN2 webserver initially ran dbCAN version 2 (referred to as dbCAN2)
  • The dbCAN2 webserver then implemented the standalone dbCAN version 3 (referred to as dbCAN3)
  • The dbCAN3 webserver implements the standalone dbCAN version 4 (referred to as dbCAN4)

Each version of dbCAN implements multiple sequence alignment and modelling tools:

  • dbCAN2:
      • DIAMOND
      • HMMER
      • Hotpep
  • dbCAN3:
      • DIAMOND
      • HMMER
      • eCAMI
  • dbCAN4:
      • DIAMOND
      • HMMER
      • dbCAN-sub (implementation of HMMER)

All references to HMMER and DIAMOND refer to the implementations of these tools within dbCAN and, for this evaluation, specifically to their implementations within dbCAN4.

dbCAN2 and dbCAN3

Han Zhang and others, dbCAN2: a meta server for automated carbohydrate-active enzyme annotation, Nucleic Acids Research, Volume 46, Issue W1, 2 July 2018, Pages W95–W101

dbCAN4 and dbCAN-sub (dbCAN)

Zheng J, Ge Q, Yan Y, Zhang X, Huang L, Yin Y. dbCAN3: automated carbohydrate-active enzyme and substrate annotation. Nucleic Acids Res. 2023 Jul 5;51(W1):W115-W121

eCAMI

Xu J, Zhang H, Zheng J, Dovoedo P, Yin Y. eCAMI: simultaneous classification and motif identification for enzyme annotation. Bioinformatics. 2020 Apr 1;36(7):2068-2075

CUPP

Barrett, K., Lange, L. Peptide-based functional annotation of carbohydrate-active enzymes by conserved unique peptide patterns (CUPP). Biotechnol Biofuels 12, 102 (2019). https://doi.org/10.1186/s13068-019-1436-5

HMMER

Eddy SR. Profile hidden Markov models. Bioinformatics. 1998;14(9):755-63.

DIAMOND

Buchfink, B., Xie, C. & Huson, D. Fast and sensitive protein alignment using DIAMOND. Nat Methods 12, 59–60 (2015). https://doi.org/10.1038/nmeth.3176

2 Introduction

CAZyme classifiers analyse query protein sequences and predict CAZyme domains and their associated CAZy family annotations. This enables exploratory analysis of CAZyme complements not presently catalogued in the CAZy database (www.cazy.org). Each CAZyme classifier implements a different method to predict CAZy family annotations.

We previously published an evaluation of dbCAN2, eCAMI and CUPP (Hobbs et al., 2021). Since then, two standalone versions of dbCAN have been released (dbCAN3 and dbCAN4). Additionally, the previous analysis was limited to 70 genomes and was overweighted towards bacterial genomes. To address these points, we present here an independent and comprehensive evaluation of the CAZyme classifiers:

  • dbCAN2 (v2.0.11)
  • dbCAN3 (v3.0.7)
  • dbCAN4 (v4.0.0)
  • eCAMI (as implemented by dbCAN3 v3.0.7)
  • CUPP (v???)
  • HMMER (from dbCAN4)
  • dbCAN-sub (from dbCAN4)
  • DIAMOND (from dbCAN4)

Evaluating performance of:

  • Binary CAZyme/non-CAZyme classification
  • Test set dependent performance of CAZyme/non-CAZyme classification
  • Binary classification per CAZy class
  • Multilabel CAZy class classification
  • Binary classification per CAZy family
  • Multilabel CAZy family classification

Hobbs, Emma E. M.; Gloster, Tracey M.; Chapman, Sean; Pritchard, Leighton (2021). Microbiology Society Annual Conference 2021. figshare. Poster. https://doi.org/10.6084/m9.figshare.14370836.v3

3 Test sets

For each genomic assembly selected for inclusion in the benchmark, a single test set was created containing 100 CAZymes and the 100 non-CAZymes with the highest sequence similarity to the CAZymes (rated by bit-score ratio).

The 100 non-CAZymes with the highest sequence similarity were chosen to increase the probability of confusing the classifiers, and so give a better idea of the performance to expect when using them. An equal number of CAZymes and non-CAZymes was selected to prevent over-representation of one population relative to the other.
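The selection described above can be sketched as follows. This is a minimal illustration in Python rather than the code used for the evaluation: the function name `build_test_set`, the random choice of the 100 CAZymes, and the assumption that a bit-score ratio has already been computed for every non-CAZyme are all hypothetical.

```python
import random


def build_test_set(cazyme_ids, non_cazyme_scores, n=100, seed=42):
    """Pick n CAZymes and the n most CAZyme-like non-CAZymes for one genome.

    cazyme_ids: protein IDs catalogued in CAZy (assumed input)
    non_cazyme_scores: dict of non-CAZyme protein ID -> bit-score ratio
                       (assumed to be precomputed elsewhere)
    """
    if len(cazyme_ids) < n or len(non_cazyme_scores) < n:
        raise ValueError("assembly does not contain enough proteins")
    rng = random.Random(seed)
    # Assumption: CAZymes are sampled at random; the report does not say how.
    positives = rng.sample(sorted(cazyme_ids), n)
    # Rank non-CAZymes from highest to lowest bit-score ratio and keep the top n.
    ranked = sorted(non_cazyme_scores.items(), key=lambda kv: kv[1], reverse=True)
    negatives = [protein_id for protein_id, _ in ranked[:n]]
    return positives, negatives
```

Keeping the two populations the same size means that accuracy on the test set is not dominated by either class.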

For a genomic assembly to be used to create a test set, it had to meet all of the following criteria:

  • Contains at least 100 CAZymes
  • Contains at least 100 non-CAZymes
  • Has an ‘Assembly level’ of ‘Complete Genome’ in the NCBI Assembly database
  • Protein records are still present in NCBI
  • Not listed as an ‘Anomalous assembly’ in the NCBI Assembly database

The genomic assemblies were also chosen from a range of taxonomies, to provide an informative picture of the performance of the classifiers across the range of datasets that users may wish to analyse.

We took the 70 test sets used in the previous evaluation (Hobbs et al., 2021), and added test sets from 10 further genomes.

## [1] "Mean percentage of genome incorporated in the CAZome across all test sets:"
## [1] 3.22
## [1] "Standard deviation of the percentage of genome incorporated in the CAZome across all test sets:"
## [1] 1.17
## [1] "Mean percentage of CAZomes incorporated in the test set across all genomes:"
## [1] 58.6
## [1] "Standard deviation of the percentage of CAZome incorporated in the test set across all genomes:"
## [1] 27.55

Figure 3.1: Histogram of CAZome coverage of the test sets for each respective source genomic assembly, overlaid by a box and whisker plot of the percentage of the CAZome incorporated in the test set.

4 CAZyme/non-CAZyme classification

The assignment of CAZy family annotations by a CAZyme classifier identifies the protein as a CAZyme. If no CAZy family annotations are assigned to a protein by a CAZyme classifier, the tool identified the protein as a non-CAZyme. Here we evaluate the performance of each CAZyme classifier to differentiate between CAZymes and non-CAZymes (defined as proteins catalogued and not catalogued in CAZy respectively).
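As a minimal sketch of this rule (in Python, with a hypothetical function name and a made-up three-protein example), a protein counts as a predicted CAZyme when the classifier assigned it at least one CAZy family, and that call is compared against CAZy membership to obtain the confusion-matrix counts:

```python
def confusion_counts(predicted_families, known_cazymes):
    """Count TP, FP, TN and FN for one classifier on one test set.

    predicted_families: dict of protein ID -> list of predicted CAZy families
    known_cazymes: set of protein IDs catalogued in CAZy
    """
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for protein, families in predicted_families.items():
        predicted_positive = len(families) > 0  # any family annotation => CAZyme
        actual_positive = protein in known_cazymes
        if predicted_positive and actual_positive:
            counts["TP"] += 1
        elif predicted_positive:
            counts["FP"] += 1
        elif actual_positive:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Illustrative data only: p1 is found, p2 is missed, p3 is a false positive.
example = confusion_counts(
    {"p1": ["GH5"], "p2": [], "p3": ["GT2"]},
    known_cazymes={"p1", "p2"},
)
```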

4.1 Summary statistics

For every classifier-test set pair, the specificity, sensitivity, precision, F1-score and accuracy were calculated. The mean of each statistic was then calculated for each classifier across all test sets, to represent the overall performance of that classifier. The 95% confidence interval (CI) was also calculated, owing to the tendency of the mean to skew towards 1. These results are presented in Table 4.1.

Table 4.1: Overall performance of the CAZyme classifiers in differentiating between CAZymes and non-CAZymes
Classifier Spec Mean Spec Standard Deviation Spec Lower CI Spec Upper CI Sens Mean Sens Standard Deviation Sens Lower CI Sens Upper CI Prec Mean Prec Standard Deviation Prec Lower CI Prec Upper CI F1-score Mean F1-score Standard Deviation F1-score Lower CI F1-score Upper CI Acc Mean Acc Standard Deviation Acc Lower CI Acc Upper CI
CUPP 0.9818 0.0487 0.9709 0.9926 0.8539 0.0744 0.8373 0.8704 0.9812 0.0454 0.9711 0.9913 0.9109 0.0520 0.8994 0.9225 0.9178 0.0460 0.9076 0.9280
dbCAN_2 0.9772 0.0526 0.9655 0.9890 0.9012 0.1082 0.8772 0.9253 0.9779 0.0453 0.9679 0.9880 0.9332 0.0789 0.9157 0.9508 0.9392 0.0596 0.9260 0.9525
dbCAN_2:DIAMOND 0.9744 0.0542 0.9623 0.9864 0.9158 0.1287 0.8871 0.9444 0.9758 0.0448 0.9659 0.9858 0.9380 0.0923 0.9175 0.9586 0.9451 0.0673 0.9301 0.9600
dbCAN_2:HMMER 0.9786 0.0517 0.9671 0.9901 0.8779 0.0807 0.8599 0.8958 0.9788 0.0446 0.9689 0.9887 0.9225 0.0639 0.9083 0.9367 0.9282 0.0473 0.9177 0.9388
dbCAN_2:Hotpep 0.9759 0.0459 0.9657 0.9861 0.8084 0.1268 0.7801 0.8366 0.9717 0.0535 0.9598 0.9836 0.8771 0.0916 0.8567 0.8975 0.8921 0.0714 0.8762 0.9080
dbCAN_3 0.9592 0.0768 0.9422 0.9763 0.9871 0.0436 0.9774 0.9968 0.9644 0.0556 0.9521 0.9768 0.9742 0.0410 0.9651 0.9833 0.9732 0.0436 0.9635 0.9829
dbCAN_3:DIAMOND 0.9665 0.0736 0.9501 0.9829 0.9689 0.0823 0.9506 0.9872 0.9706 0.0543 0.9586 0.9827 0.9664 0.0660 0.9517 0.9811 0.9677 0.0542 0.9556 0.9798
dbCAN_3:eCAMI 0.9752 0.0475 0.9647 0.9858 0.8516 0.1278 0.8232 0.8801 0.9738 0.0463 0.9635 0.9841 0.9026 0.0870 0.8832 0.9220 0.9134 0.0682 0.8983 0.9286
dbCAN_3:HMMER 0.9788 0.0520 0.9672 0.9903 0.8880 0.0818 0.8698 0.9062 0.9791 0.0448 0.9692 0.9891 0.9282 0.0646 0.9139 0.9426 0.9334 0.0481 0.9227 0.9441
dbCAN_4 0.9622 0.0763 0.9453 0.9792 0.9841 0.0640 0.9699 0.9984 0.9672 0.0555 0.9549 0.9796 0.9733 0.0531 0.9614 0.9851 0.9732 0.0491 0.9623 0.9841
dbCAN_4:dbCAN-sub 0.9775 0.0529 0.9657 0.9893 0.9498 0.0729 0.9335 0.9660 0.9792 0.0432 0.9696 0.9888 0.9618 0.0563 0.9493 0.9744 0.9636 0.0449 0.9536 0.9736
dbCAN_4:DIAMOND 0.9668 0.0744 0.9502 0.9833 0.9694 0.0835 0.9508 0.9880 0.9710 0.0545 0.9589 0.9832 0.9666 0.0694 0.9511 0.9820 0.9681 0.0547 0.9559 0.9802
dbCAN_4:HMMER 0.9788 0.0520 0.9672 0.9903 0.8900 0.0819 0.8718 0.9082 0.9792 0.0447 0.9692 0.9891 0.9293 0.0648 0.9149 0.9438 0.9344 0.0483 0.9236 0.9451
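
The statistics summarised in Table 4.1 can be sketched as follows. This is an illustration in Python, not the code that generated the table: the function names are hypothetical, the denominators are assumed to be non-zero, and the confidence interval uses a normal approximation, which is an assumption because the report does not state which formula was used (a t-based interval would be slightly wider).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def binary_stats(tp, fp, tn, fn):
    """Return the five statistics for one classifier-test set pair."""
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)  # also known as recall
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "specificity": specificity,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


def summarise(values, level=0.95):
    """Mean, sample standard deviation and a normal-approximation CI."""
    m = mean(values)
    sd = stdev(values)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * sd / sqrt(len(values))
    return {"mean": m, "sd": sd, "lower": m - half_width, "upper": m + half_width}
```

Calling `binary_stats` once per test set and passing the resulting list of, say, specificities to `summarise` gives one row segment of a table laid out like Table 4.1.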

The 95% CI was plotted as error bars around the mean (Figure 4.1).

Figure 4.1: Summary statistics of the CAZyme classifiers' performance in binary CAZyme/non-CAZyme prediction. The mean plus and minus the 95% confidence interval.

4.2 Specificity

Specificity is the proportion of known negatives (known non-CAZymes) which are correctly classified as negatives (non-CAZymes). Figure 4.2 is a graphical representation of the results calculated in table 4.1.

Figure 4.2: One-dimensional scatter plot of specificity scores of CAZyme and non-CAZyme predictions per test set, overlaying box plot of standard deviation.

4.3 Sensitivity

Sensitivity (also known as recall) is the proportion of known positives (CAZymes) that are correctly identified as positives (CAZymes). Figure 4.3 graphically represents the results calculated in Table 4.1.

Figure 4.3: One-dimensional scatter plot of recall (sensitivity) scores of CAZyme and non-CAZyme predictions per test set, overlaying box plot of standard deviation.

4.4 Precision

Precision is the proportion of positive predictions by the classifiers that are correct. In this case, precision represents the fraction of CAZyme predictions by the classifiers that are correct, specifically the proportion of predicted CAZymes that are known CAZymes. Figure 4.4 is a visual representation of the results calculated in table 4.1.

Figure 4.4: One-dimensional scatter plot of precision scores of CAZyme and non-CAZyme predictions per test set, overlaying box plot of standard deviation.

4.5 F1-score

The F1-score is the harmonic mean of recall and precision, and provides an idea of the overall performance of the tool, with 0 being the worst and 1 the best performance. Figure 4.5 shows the F1-score from each test set, for each classifier.

Figure 4.5: Bar chart of the F1-score of the CAZyme classifiers' differentiation between CAZymes and non-CAZymes.

4.6 Accuracy

Accuracy, calculated as (TP + TN) / (TP + TN + FP + FN), provides an idea of the overall performance of the classifiers by measuring the degree to which their CAZyme/non-CAZyme predictions conform to the correct result. Figure 4.6 plots the corresponding data from Table 4.1.

Figure 4.6: Bar chart of the accuracy of the CAZyme classifiers' differentiation between CAZymes and non-CAZymes.

4.7 Combined statistics plot

Here we generate a plot that combines the plots from above into a single figure.

Figure 4.7: Box and whisker plots of the performance of CAZyme/non-CAZyme classification.

4.8 ROC curve - Receiver Operator Characteristic curve

The Receiver Operator Characteristic (ROC) curve (in Figure 4.8) enables us to compare sensitivity to specificity by plotting sensitivity versus 1-specificity.
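
A minimal sketch of the coordinates involved is given below, in Python with hypothetical names and made-up counts. It assumes that each classifier returns a single binary call per protein, so that every classifier-test set pair contributes one point in ROC space rather than a curve swept out by a score threshold.

```python
def roc_point(tp, fp, tn, fn):
    """Return (1 - specificity, sensitivity) for one set of predictions."""
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    true_positive_rate = tp / (tp + fn)   # sensitivity
    return false_positive_rate, true_positive_rate


# Hypothetical counts for one classifier on a test set of 100 + 100 proteins.
x, y = roc_point(tp=90, fp=2, tn=98, fn=10)
```

Points close to the top-left corner (x near 0, y near 1) correspond to classifiers that are both sensitive and specific.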

Figure 4.8: Receiver Operator Characteristic (ROC) curve of CAZyme/non-CAZyme classification.

4.9 Expected Range of Accuracy

The statistics evaluated above provide an idea of the general performance of the tools, but they do not provide an idea of the expected range of performance. Specifically, the data do not provide a clear picture of the best and worst performance a user can expect when using these tools.

To compare the expected typical range in accuracies for each classifier, 6 test sets (identified by their source genomic assemblies) were selected at random. The CAZyme/non-CAZyme predictions for each classifier, for each test set, were bootstrap resampled 100 times, and the accuracy was calculated for each bootstrap sample. The accuracies of the bootstrap samples for each classifier were plotted as stacked histograms, shown in Figure 4.9.
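
The resampling step can be sketched as below. This is an illustrative Python version with hypothetical names, assuming that each bootstrap sample is drawn with replacement from the proteins of one test set and has the same size as that test set; the report does not give these details.

```python
import random


def bootstrap_accuracies(truth, predicted, n_boot=100, seed=0):
    """Accuracy of n_boot bootstrap resamples of one test set.

    truth, predicted: equal-length lists of 1 (CAZyme) / 0 (non-CAZyme)
    """
    rng = random.Random(seed)
    pairs = list(zip(truth, predicted))
    n = len(pairs)
    accuracies = []
    for _ in range(n_boot):
        # Draw n proteins with replacement, keeping truth and prediction paired.
        sample = rng.choices(pairs, k=n)
        correct = sum(1 for t, p in sample if t == p)
        accuracies.append(correct / n)
    return accuracies
```

Running this once per classifier for each of the 6 selected test sets would give 600 accuracies per classifier, which is the kind of data a stacked histogram can display.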

Figure 4.9: Stacked histograms of bootstrap sample accuracies of the CAZyme classifiers' differentiation between CAZymes and non-CAZymes. Six test sets (identified by their source genomic assembly) were selected at random. The CAZyme/non-CAZyme predictions for each classifier, for each test set, were bootstrap resampled 100 times. The accuracies of the 600 bootstrap samples per classifier were plotted as a stacked histogram.

4.10 Conclusions on the Binary CAZyme/non-CAZyme Prediction Performance

Overall, all tools showed a low probability of producing false positives (misclassifying a non-CAZyme as a CAZyme), and few of their positive predictions were false positives. Therefore, we can be confident that the CAZyme predictions made by each of these tools are most likely correct. However, all the classifiers consistently failed to identify every CAZyme within a CAZome. We should therefore not presume that all non-CAZyme predictions are correct: these classifiers are unlikely to identify the complete CAZome, although a near-complete CAZome will be accurately identified.

5 CAZyme/non-CAZyme classification: Taxonomic evaluation

The performance of a classifier may vary between taxonomy groups. For this evaluation the test sets were separated into the taxonomy groups:

  • Bacteria
  • Eukaryote

The performance of each classifier within each taxonomy group was compared against its performance across all test sets pooled together.
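
A minimal sketch of this grouping is shown below, in Python with hypothetical names. It takes one score per test set (for example the specificity of a single classifier), splits the scores by taxonomy group, and also pools them under an "All" label, mirroring the Bact, Euk and All columns of the tables in this section.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarise_by_group(scores, groups):
    """Mean and sample standard deviation per taxonomy group and pooled.

    scores: dict of test set ID -> score for one classifier
    groups: dict of test set ID -> "Bacteria" or "Eukaryote"
    """
    grouped = defaultdict(list)
    for test_set, score in scores.items():
        grouped[groups[test_set]].append(score)
        grouped["All"].append(score)  # pooled across every test set
    return {
        group: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for group, values in grouped.items()
    }
```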

5.1 Specificity

Table 5.1: The specificity of binary CAZyme/non-CAZyme classification by CAZyme classifiers per taxonomy group
Prediction_tool Bact Mean Bact Standard Deviation Bact Lower CI Bact Upper CI Euk Mean Euk Standard Deviation Euk Lower CI Euk Upper CI All Mean All Standard Deviation All Lower CI All Upper CI
CUPP 0.9888 0.0276 0.9799 0.9976 0.9748 0.0628 0.9547 0.9948 0.9818 0.0486 0.9742 0.9893
dbCAN_2 0.9828 0.0349 0.9716 0.9939 0.9718 0.0658 0.9507 0.9928 0.9772 0.0525 0.9691 0.9854
dbCAN_2:DIAMOND 0.9790 0.0352 0.9677 0.9903 0.9698 0.0684 0.9479 0.9916 0.9744 0.0541 0.9659 0.9828
dbCAN_2:HMMER 0.9858 0.0286 0.9766 0.9949 0.9715 0.0670 0.9501 0.9929 0.9786 0.0515 0.9706 0.9867
dbCAN_2:Hotpep 0.9788 0.0344 0.9677 0.9898 0.9730 0.0553 0.9553 0.9907 0.9759 0.0457 0.9687 0.9830
dbCAN_3 0.9685 0.0402 0.9557 0.9813 0.9500 0.1007 0.9178 0.9822 0.9592 0.0765 0.9473 0.9712
dbCAN_3:DIAMOND 0.9785 0.0363 0.9669 0.9901 0.9545 0.0967 0.9236 0.9854 0.9665 0.0734 0.9550 0.9780
dbCAN_3:HMMER 0.9868 0.0284 0.9777 0.9958 0.9708 0.0673 0.9492 0.9923 0.9788 0.0518 0.9707 0.9868
dbCAN_3:eCAMI 0.9772 0.0351 0.9660 0.9885 0.9732 0.0578 0.9548 0.9917 0.9752 0.0474 0.9678 0.9827
dbCAN_4 0.9740 0.0383 0.9618 0.9862 0.9505 0.1003 0.9184 0.9826 0.9622 0.0761 0.9504 0.9741
dbCAN_4:DIAMOND 0.9798 0.0355 0.9684 0.9911 0.9538 0.0980 0.9224 0.9851 0.9668 0.0742 0.9552 0.9783
dbCAN_4:HMMER 0.9868 0.0284 0.9777 0.9958 0.9708 0.0673 0.9492 0.9923 0.9788 0.0518 0.9707 0.9868
dbCAN_4:dbCAN-sub 0.9868 0.0270 0.9781 0.9954 0.9682 0.0690 0.9462 0.9903 0.9775 0.0527 0.9693 0.9857

5.2 Sensitivity

Table 5.2: The sensitivity of binary CAZyme/non-CAZyme classification by CAZyme classifiers per taxonomy group
Prediction_tool Bact Mean Bact Standard Deviation Bact Lower CI Bact Upper CI Euk Mean Euk Standard Deviation Euk Lower CI Euk Upper CI All Mean All Standard Deviation All Lower CI All Upper CI
CUPP 0.8618 0.0788 0.8366 0.8869 0.8460 0.0699 0.8236 0.8684 0.8539 0.0742 0.8423 0.8655
dbCAN_2 0.9052 0.1144 0.8686 0.9419 0.8972 0.1029 0.8643 0.9302 0.9012 0.1079 0.8844 0.9181
dbCAN_2:DIAMOND 0.9182 0.1346 0.8752 0.9613 0.9132 0.1241 0.8736 0.9529 0.9158 0.1283 0.8957 0.9358
dbCAN_2:HMMER 0.8695 0.1078 0.8350 0.9040 0.8862 0.0379 0.8741 0.8984 0.8779 0.0804 0.8653 0.8904
dbCAN_2:Hotpep 0.8195 0.1164 0.7823 0.8567 0.7972 0.1371 0.7534 0.8411 0.8084 0.1264 0.7886 0.8281
dbCAN_3 0.9832 0.0605 0.9639 1.0026 0.9910 0.0130 0.9869 0.9951 0.9871 0.0435 0.9803 0.9939
dbCAN_3:DIAMOND 0.9560 0.1088 0.9212 0.9908 0.9818 0.0392 0.9692 0.9943 0.9689 0.0820 0.9561 0.9817
dbCAN_3:HMMER 0.8728 0.1088 0.8380 0.9075 0.9032 0.0353 0.8920 0.9145 0.8880 0.0815 0.8753 0.9007
dbCAN_3:eCAMI 0.8832 0.1258 0.8430 0.9235 0.8200 0.1232 0.7806 0.8594 0.8516 0.1273 0.8317 0.8715
dbCAN_4 0.9748 0.0897 0.9461 1.0034 0.9935 0.0083 0.9908 0.9962 0.9841 0.0638 0.9742 0.9941
dbCAN_4:DIAMOND 0.9498 0.1149 0.9130 0.9865 0.9890 0.0110 0.9855 0.9925 0.9694 0.0832 0.9564 0.9824
dbCAN_4:HMMER 0.8765 0.1097 0.8414 0.9116 0.9035 0.0342 0.8926 0.9144 0.8900 0.0816 0.8773 0.9027
dbCAN_4:dbCAN-sub 0.9520 0.1011 0.9197 0.9843 0.9475 0.0228 0.9402 0.9548 0.9498 0.0726 0.9384 0.9611

5.3 Precision

Table 5.3: The precision of binary CAZyme/non-CAZyme classification by CAZyme classifiers per taxonomy group
Prediction_tool Bact Mean Bact Standard Deviation Bact Lower CI Bact Upper CI Euk Mean Euk Standard Deviation Euk Lower CI Euk Upper CI All Mean All Standard Deviation All Lower CI All Upper CI
CUPP 0.9877 0.0280 0.9787 0.9966 0.9748 0.0574 0.9564 0.9932 0.9812 0.0452 0.9742 0.9883
dbCAN_2 0.9824 0.0348 0.9713 0.9936 0.9734 0.0540 0.9562 0.9907 0.9779 0.0452 0.9709 0.9850
dbCAN_2:DIAMOND 0.9791 0.0348 0.9680 0.9902 0.9726 0.0533 0.9555 0.9896 0.9758 0.0447 0.9689 0.9828
dbCAN_2:HMMER 0.9847 0.0299 0.9751 0.9943 0.9729 0.0553 0.9552 0.9906 0.9788 0.0445 0.9719 0.9857
dbCAN_2:Hotpep 0.9759 0.0381 0.9637 0.9881 0.9676 0.0656 0.9466 0.9885 0.9717 0.0533 0.9634 0.9801
dbCAN_3 0.9704 0.0357 0.9590 0.9818 0.9585 0.0702 0.9361 0.9810 0.9644 0.0555 0.9558 0.9731
dbCAN_3:DIAMOND 0.9795 0.0332 0.9689 0.9901 0.9618 0.0686 0.9399 0.9837 0.9706 0.0541 0.9622 0.9791
dbCAN_3:HMMER 0.9858 0.0297 0.9763 0.9953 0.9725 0.0556 0.9547 0.9903 0.9791 0.0446 0.9722 0.9861
dbCAN_3:eCAMI 0.9764 0.0349 0.9653 0.9876 0.9713 0.0557 0.9534 0.9891 0.9738 0.0461 0.9666 0.9810
dbCAN_4 0.9755 0.0346 0.9645 0.9866 0.9589 0.0700 0.9365 0.9813 0.9672 0.0553 0.9586 0.9759
dbCAN_4:DIAMOND 0.9805 0.0327 0.9701 0.9910 0.9615 0.0690 0.9394 0.9836 0.9710 0.0544 0.9625 0.9795
dbCAN_4:HMMER 0.9858 0.0297 0.9763 0.9953 0.9725 0.0556 0.9548 0.9903 0.9792 0.0446 0.9722 0.9861
dbCAN_4:dbCAN-sub 0.9869 0.0258 0.9786 0.9951 0.9715 0.0547 0.9540 0.9889 0.9792 0.0431 0.9724 0.9859

5.4 F1-score

Table 5.4: The F1-score of binary CAZyme/non-CAZyme classification by CAZyme classifiers per taxonomy group
Prediction_tool Bact Mean Bact Standard Deviation Bact Lower CI Bact Upper CI Euk Mean Euk Standard Deviation Euk Lower CI Euk Upper CI All Mean All Standard Deviation All Lower CI All Upper CI
CUPP 0.9184 0.0518 0.9018 0.9349 0.9035 0.0519 0.8869 0.9201 0.9109 0.0519 0.9028 0.9190
dbCAN_2 0.9373 0.0837 0.9105 0.9640 0.9292 0.0746 0.9053 0.9530 0.9332 0.0786 0.9209 0.9455
dbCAN_2:DIAMOND 0.9406 0.0985 0.9091 0.9722 0.9354 0.0868 0.9076 0.9632 0.9380 0.0920 0.9237 0.9524
dbCAN_2:HMMER 0.9188 0.0839 0.8919 0.9456 0.9263 0.0346 0.9152 0.9373 0.9225 0.0637 0.9126 0.9324
dbCAN_2:Hotpep 0.8861 0.0757 0.8619 0.9103 0.8680 0.1053 0.8344 0.9017 0.8771 0.0913 0.8628 0.8913
dbCAN_3 0.9753 0.0406 0.9623 0.9883 0.9730 0.0418 0.9596 0.9864 0.9742 0.0408 0.9678 0.9805
dbCAN_3:DIAMOND 0.9628 0.0818 0.9366 0.9889 0.9700 0.0459 0.9553 0.9847 0.9664 0.0658 0.9561 0.9767
dbCAN_3:HMMER 0.9210 0.0845 0.8940 0.9481 0.9354 0.0347 0.9243 0.9465 0.9282 0.0644 0.9182 0.9383
dbCAN_3:eCAMI 0.9219 0.0802 0.8962 0.9476 0.8833 0.0903 0.8544 0.9121 0.9026 0.0868 0.8890 0.9161
dbCAN_4 0.9720 0.0627 0.9519 0.9920 0.9745 0.0421 0.9611 0.9880 0.9733 0.0529 0.9650 0.9815
dbCAN_4:DIAMOND 0.9594 0.0890 0.9310 0.9879 0.9737 0.0415 0.9604 0.9870 0.9666 0.0691 0.9558 0.9774
dbCAN_4:HMMER 0.9231 0.0851 0.8959 0.9503 0.9356 0.0343 0.9246 0.9465 0.9293 0.0646 0.9193 0.9394
dbCAN_4:dbCAN-sub 0.9654 0.0738 0.9418 0.9889 0.9583 0.0308 0.9484 0.9682 0.9618 0.0561 0.9531 0.9706

5.5 Accuracy

Table 5.5: The accuracy of binary CAZyme/non-CAZyme classification by CAZyme classifiers per taxonomy group
Prediction_tool Bact Mean Bact Standard Deviation Bact Lower CI Bact Upper CI Euk Mean Euk Standard Deviation Euk Lower CI Euk Upper CI All Mean All Standard Deviation All Lower CI All Upper CI
CUPP 0.9252 0.0430 0.9115 0.9390 0.9104 0.0481 0.8950 0.9258 0.9178 0.0458 0.9107 0.9250
dbCAN_2 0.9440 0.0600 0.9248 0.9632 0.9345 0.0594 0.9155 0.9535 0.9392 0.0594 0.9300 0.9485
dbCAN_2:DIAMOND 0.9486 0.0687 0.9267 0.9706 0.9415 0.0666 0.9202 0.9628 0.9451 0.0671 0.9346 0.9555
dbCAN_2:HMMER 0.9276 0.0561 0.9097 0.9456 0.9289 0.0372 0.9170 0.9408 0.9282 0.0472 0.9209 0.9356
dbCAN_2:Hotpep 0.8991 0.0603 0.8799 0.9184 0.8851 0.0813 0.8591 0.9111 0.8921 0.0712 0.8810 0.9032
dbCAN_3 0.9759 0.0354 0.9645 0.9872 0.9705 0.0509 0.9542 0.9868 0.9732 0.0435 0.9664 0.9800
dbCAN_3:DIAMOND 0.9672 0.0561 0.9493 0.9852 0.9681 0.0531 0.9512 0.9851 0.9677 0.0541 0.9592 0.9761
dbCAN_3:HMMER 0.9298 0.0568 0.9116 0.9479 0.9370 0.0378 0.9249 0.9491 0.9334 0.0479 0.9259 0.9409
dbCAN_3:eCAMI 0.9302 0.0635 0.9099 0.9506 0.8966 0.0693 0.8745 0.9188 0.9134 0.0680 0.9028 0.9241
dbCAN_4 0.9744 0.0474 0.9592 0.9895 0.9720 0.0513 0.9556 0.9884 0.9732 0.0489 0.9655 0.9808
dbCAN_4:DIAMOND 0.9648 0.0591 0.9459 0.9836 0.9714 0.0504 0.9552 0.9875 0.9681 0.0545 0.9596 0.9766
dbCAN_4:HMMER 0.9316 0.0574 0.9133 0.9500 0.9371 0.0375 0.9251 0.9491 0.9344 0.0481 0.9269 0.9419
dbCAN_4:dbCAN-sub 0.9694 0.0528 0.9525 0.9863 0.9579 0.0351 0.9466 0.9691 0.9636 0.0448 0.9566 0.9706

6 CAZy class classification

CAZy groups CAZymes into CAZy families by sequence similarity, and each CAZy family belongs to one of 6 functional classes. The CAZyme classifiers predict the CAZy family annotations of predicted CAZymes, but it is also of interest to assess the performance of the classifiers at the CAZy class level. Specifically, a classifier may struggle to predict the correct CAZy family for a CAZyme but consistently predict the correct CAZy class. Therefore, the aim of this part of the evaluation is to assess how well the classifiers predict the correct CAZy class of predicted CAZymes.
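
Because a CAZy family name begins with the abbreviation of its class (GH, GT, PL, CE, AA or CBM), class-level predictions can be derived from family-level predictions. The sketch below is a hypothetical Python helper, not part of any of the classifiers; it assumes family names of the form `GH5` or `GH5_1` (family with an optional subfamily suffix), and tool output in other formats would need normalising first.

```python
import re

CAZY_CLASSES = ("GH", "GT", "PL", "CE", "AA", "CBM")

# Class abbreviation, family number, and an optional "_<subfamily>" suffix.
_FAMILY_PATTERN = re.compile(r"^(GH|GT|PL|CE|AA|CBM)(\d+)(?:_\d+)?$")


def family_to_class(family):
    """Return the CAZy class abbreviation for a family name such as 'GH5_1'."""
    match = _FAMILY_PATTERN.match(family)
    if match is None:
        raise ValueError(f"unrecognised CAZy family name: {family!r}")
    return match.group(1)


def classes_for(families):
    """Collapse a protein's family annotations into its set of CAZy classes."""
    return {family_to_class(family) for family in families}
```

A protein annotated with the wrong family of the right class (for example GH5 instead of GH9) is then counted as correct at the class level, which is the distinction this section sets out to measure.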

6.2 Performance per CAZy class

Below, the prediction sensitivity is plotted against the specificity for each classifier, with a separate plot generated for each CAZy class.
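
The per-class sensitivity and specificity can be obtained by treating each CAZy class as its own binary problem. The following Python sketch uses hypothetical names and a made-up four-protein example; it assumes that a protein may belong to several classes at once and that a missing prediction means no class was predicted.

```python
def class_sens_spec(truth, predicted, cazy_class):
    """Sensitivity and specificity for one CAZy class on one test set.

    truth, predicted: dict of protein ID -> set of CAZy class abbreviations
    """
    tp = fp = tn = fn = 0
    for protein, true_classes in truth.items():
        actual = cazy_class in true_classes
        called = cazy_class in predicted.get(protein, set())
        if actual and called:
            tp += 1
        elif called:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    # Return NaN when a test set contains no positives or no negatives.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Illustrative data only.
truth = {"p1": {"GH"}, "p2": {"GH", "CBM"}, "p3": {"GT"}, "p4": set()}
predicted = {"p1": {"GH"}, "p2": {"CBM"}, "p3": {"GT"}, "p4": {"GH"}}
gh_sens, gh_spec = class_sens_spec(truth, predicted, "GH")
```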

The scatter plots of sensitivity against specificity overlay a coloured contour to highlight the distribution of the points. When too many points have the same value, a contour cannot be generated. In order to plot a contour, noise is added to the data: the original data are used to plot the scatter plot, and the data with added noise are used to plot the contour.

The percentage of data points that need noise added to them in order to generate a contour varies from data set to data set. To change this percentage, change the third argument in the call to the function plot.class.sens.vs.spec(), which is used to generate the plots. The third argument is the proportion of data points to add noise to, written as a decimal fraction.
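
The idea of adding noise to a chosen fraction of the points can be sketched as follows. This is a hypothetical Python illustration and not the implementation of plot.class.sens.vs.spec(): the function name, the Gaussian noise and its default scale are all assumptions.

```python
import random


def add_noise(points, fraction, scale=0.005, seed=0):
    """Return a copy of points with noise added to a fraction of them.

    points: list of (specificity, sensitivity) tuples
    fraction: proportion of points to perturb, as a decimal (e.g. 0.3)
    """
    rng = random.Random(seed)
    n_noisy = round(len(points) * fraction)
    noisy_indices = set(rng.sample(range(len(points)), n_noisy))
    jittered = []
    for index, (x, y) in enumerate(points):
        if index in noisy_indices:
            # Small Gaussian perturbation so identical points no longer coincide.
            x += rng.gauss(0, scale)
            y += rng.gauss(0, scale)
        jittered.append((x, y))
    return jittered
```

The original list is left untouched, so it can still be used for the scatter plot while the jittered copy is used only for the contour.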

6.2.1 GH class classification

Table 6.3: Overall performance of the CAZyme classifiers' classification of GH class members
Prediction_tool Spec Mean Spec Standard Deviation Spec CI Lower Spec CI Upper Sens Mean Sens Standard Deviation Sens CI Lower Sens CI Upper Prec Mean Prec Standard Deviation Prec CI Lower Prec CI Upper F1-score Mean F1-score Standard Deviation F1-score CI Lower F1-score CI Upper Acc Mean Acc Standard Deviation Acc CI Lower Acc CI Upper
CUPP 0.9933 0.0212 0.9886 0.9980 0.9080 0.0675 0.8930 0.9230 0.9906 0.0338 0.9831 0.9981 0.9461 0.0454 0.9360 0.9562 0.9540 0.0293 0.9475 0.9605
dbCAN_2 0.9917 0.0245 0.9862 0.9972 0.9354 0.0907 0.9152 0.9556 0.9886 0.0347 0.9809 0.9963 0.9583 0.0614 0.9446 0.9719 0.9658 0.0361 0.9578 0.9738
dbCAN_2:DIAMOND 0.9862 0.0276 0.9801 0.9924 0.9379 0.1064 0.9142 0.9615 0.9845 0.0360 0.9765 0.9926 0.9563 0.0727 0.9401 0.9725 0.9644 0.0415 0.9551 0.9736
dbCAN_2:HMMER 0.9927 0.0242 0.9873 0.9981 0.9080 0.0824 0.8896 0.9263 0.9897 0.0339 0.9821 0.9972 0.9444 0.0587 0.9313 0.9574 0.9536 0.0330 0.9463 0.9609
dbCAN_2:Hotpep 0.9833 0.0286 0.9769 0.9897 0.8641 0.1181 0.8378 0.8904 0.9796 0.0419 0.9702 0.9889 0.9136 0.0773 0.8964 0.9308 0.9297 0.0495 0.9187 0.9407
dbCAN_3 0.9923 0.0228 0.9873 0.9974 0.9567 0.0771 0.9396 0.9739 0.9897 0.0332 0.9823 0.9971 0.9708 0.0558 0.9583 0.9832 0.9768 0.0304 0.9701 0.9836
dbCAN_3:DIAMOND 0.9844 0.0292 0.9779 0.9909 0.9760 0.0673 0.9610 0.9910 0.9830 0.0391 0.9743 0.9917 0.9776 0.0491 0.9666 0.9885 0.9812 0.0295 0.9747 0.9878
dbCAN_3:eCAMI 0.9848 0.0295 0.9782 0.9913 0.8764 0.1097 0.8520 0.9008 0.9829 0.0395 0.9741 0.9917 0.9223 0.0695 0.9069 0.9378 0.9364 0.0485 0.9256 0.9472
dbCAN_3:HMMER 0.9939 0.0224 0.9889 0.9988 0.9198 0.0830 0.9013 0.9383 0.9909 0.0330 0.9836 0.9983 0.9514 0.0596 0.9382 0.9647 0.9599 0.0331 0.9525 0.9672
dbCAN_4 0.9927 0.0240 0.9873 0.9980 0.9500 0.0802 0.9322 0.9679 0.9898 0.0337 0.9823 0.9973 0.9671 0.0590 0.9539 0.9802 0.9743 0.0307 0.9675 0.9812
dbCAN_4:dbCAN-sub 0.9924 0.0238 0.9871 0.9977 0.9473 0.0789 0.9297 0.9648 0.9896 0.0337 0.9821 0.9971 0.9655 0.0581 0.9526 0.9785 0.9731 0.0297 0.9665 0.9797
dbCAN_4:DIAMOND 0.9864 0.0279 0.9802 0.9927 0.9737 0.0763 0.9567 0.9906 0.9842 0.0393 0.9754 0.9929 0.9764 0.0583 0.9634 0.9894 0.9812 0.0304 0.9745 0.9880
dbCAN_4:HMMER 0.9939 0.0224 0.9889 0.9988 0.9202 0.0828 0.9017 0.9386 0.9909 0.0330 0.9836 0.9983 0.9516 0.0596 0.9384 0.9649 0.9601 0.0328 0.9528 0.9674

Figure 6.5: Scatter plot of sensitivity against specificity for predicting GH CAZy class members per CAZyme classifier, overlaying a density map.

Figure 6.6: Summary statistics of the CAZyme classifiers' performance in GH class classification, plotting the mean plus and minus the 95% confidence interval.

Figure 6.7: One-dimensional scatter plot of the statistical parameters per test set for the classification of GH class members, overlaying a box plot.

6.2.2 GT class classification

Table 6.4: Overall performance of the CAZyme classifiers' classification of GT class members
Prediction_tool Spec Mean Spec Standard Deviation Spec CI Lower Spec CI Upper Sens Mean Sens Standard Deviation Sens CI Lower Sens CI Upper Prec Mean Prec Standard Deviation Prec CI Lower Prec CI Upper F1-score Mean F1-score Standard Deviation F1-score CI Lower F1-score CI Upper Acc Mean Acc Standard Deviation Acc CI Lower Acc CI Upper
CUPP 0.9921 0.0463 0.9818 1.0024 0.8536 0.1107 0.8289 0.8782 0.9883 0.0625 0.9744 1.0022 0.9107 0.0758 0.8938 0.9276 0.9462 0.0579 0.9333 0.9591
dbCAN_2 0.9927 0.0454 0.9826 1.0028 0.8845 0.1378 0.8538 0.9152 0.9898 0.0542 0.9778 1.0019 0.9258 0.1016 0.9032 0.9484 0.9550 0.0770 0.9379 0.9721
dbCAN_2:DIAMOND 0.9919 0.0460 0.9817 1.0021 0.9255 0.1508 0.8919 0.9591 0.9886 0.0567 0.9760 1.0013 0.9463 0.1124 0.9213 0.9713 0.9672 0.0821 0.9489 0.9854
dbCAN_2:HMMER 0.9904 0.0487 0.9796 1.0012 0.8627 0.1126 0.8376 0.8877 0.9884 0.0566 0.9758 1.0010 0.9152 0.0845 0.8964 0.9340 0.9498 0.0598 0.9365 0.9631
dbCAN_2:Hotpep 0.9924 0.0421 0.9830 1.0017 0.7254 0.1807 0.6852 0.7656 0.9836 0.0688 0.9683 0.9989 0.8209 0.1383 0.7901 0.8517 0.9031 0.0908 0.8829 0.9233
dbCAN_3 0.9914 0.0474 0.9808 1.0019 0.9421 0.0971 0.9205 0.9637 0.9891 0.0563 0.9766 1.0016 0.9606 0.0790 0.9430 0.9782 0.9751 0.0585 0.9621 0.9881
dbCAN_3:DIAMOND 0.9893 0.0486 0.9784 1.0001 0.9774 0.0897 0.9574 0.9973 0.9839 0.0620 0.9701 0.9977 0.9764 0.0755 0.9596 0.9932 0.9848 0.0567 0.9722 0.9974
dbCAN_3:eCAMI 0.9922 0.0439 0.9824 1.0019 0.8500 0.1524 0.8161 0.8839 0.9881 0.0572 0.9754 1.0009 0.9046 0.1076 0.8806 0.9285 0.9413 0.0822 0.9230 0.9596
dbCAN_3:HMMER 0.9904 0.0487 0.9796 1.0012 0.8654 0.1113 0.8406 0.8901 0.9884 0.0566 0.9758 1.0010 0.9169 0.0839 0.8982 0.9355 0.9503 0.0597 0.9370 0.9635
dbCAN_4 0.9900 0.0492 0.9790 1.0009 0.9578 0.0921 0.9373 0.9783 0.9866 0.0566 0.9740 0.9992 0.9677 0.0753 0.9510 0.9845 0.9781 0.0572 0.9654 0.9909
dbCAN_4:dbCAN-sub 0.9900 0.0492 0.9790 1.0009 0.9538 0.0835 0.9352 0.9724 0.9866 0.0566 0.9740 0.9992 0.9664 0.0656 0.9518 0.9810 0.9773 0.0519 0.9657 0.9889
dbCAN_4:DIAMOND 0.9893 0.0488 0.9785 1.0002 0.9751 0.0914 0.9548 0.9954 0.9841 0.0638 0.9699 0.9983 0.9750 0.0781 0.9577 0.9924 0.9838 0.0577 0.9710 0.9966
dbCAN_4:HMMER 0.9904 0.0487 0.9796 1.0012 0.8657 0.1110 0.8410 0.8904 0.9884 0.0566 0.9758 1.0010 0.9170 0.0838 0.8984 0.9357 0.9503 0.0597 0.9370 0.9635

Figure 6.8: Scatter plot of sensitivity against specificity for predicting GT CAZy class members per CAZyme classifier, overlaying a density map.

Figure 6.9: Summary statistics of the CAZyme classifiers' performance in GT class classification, plotting the mean plus and minus the 95% confidence interval.

Figure 6.10: One-dimensional scatter plot of the statistical parameters per test set for the classification of GT class members, overlaying a box plot.

6.2.3 PL class classification

Table 6.5: Overall performance of the CAZyme classifiers' classification of PL class members
Prediction_tool Spec Mean Spec Standard Deviation Spec CI Lower Spec CI Upper Sens Mean Sens Standard Deviation Sens CI Lower Sens CI Upper Prec Mean Prec Standard Deviation Prec CI Lower Prec CI Upper F1-score Mean F1-score Standard Deviation F1-score CI Lower F1-score CI Upper Acc Mean Acc Standard Deviation Acc CI Lower Acc CI Upper
CUPP 0.9996 0.0019 0.9990 1.0002 0.8511 0.2593 0.7749 0.9272 0.9496 0.2058 0.8892 1.0101 0.8850 0.2288 0.8178 0.9522 0.9941 0.0096 0.9913 0.9970
dbCAN_2 0.9998 0.0012 0.9995 1.0002 0.8797 0.2421 0.8086 0.9508 0.9532 0.2052 0.8929 1.0134 0.9073 0.2183 0.8432 0.9714 0.9959 0.0070 0.9938 0.9979
dbCAN_2:DIAMOND 0.9996 0.0019 0.9991 1.0002 0.8691 0.2687 0.7902 0.9480 0.9248 0.2505 0.8513 0.9984 0.8889 0.2530 0.8146 0.9632 0.9954 0.0073 0.9933 0.9976
dbCAN_2:HMMER 0.9998 0.0012 0.9995 1.0002 0.8975 0.2125 0.8351 0.9598 0.9745 0.1481 0.9310 1.0180 0.9250 0.1788 0.8725 0.9775 0.9963 0.0062 0.9944 0.9981
dbCAN_2:Hotpep 0.9993 0.0027 0.9985 1.0001 0.8407 0.2581 0.7650 0.9165 0.9506 0.2050 0.8904 1.0108 0.8803 0.2257 0.8140 0.9465 0.9929 0.0131 0.9891 0.9968
dbCAN_3 0.9994 0.0025 0.9986 1.0001 0.9881 0.0732 0.9666 1.0096 0.9846 0.0766 0.9621 1.0071 0.9826 0.0689 0.9624 1.0028 0.9990 0.0030 0.9981 0.9999
dbCAN_3:DIAMOND 0.9989 0.0039 0.9978 1.0000 0.9881 0.0732 0.9666 1.0096 0.9730 0.1012 0.9433 1.0028 0.9754 0.0792 0.9521 0.9986 0.9985 0.0041 0.9973 0.9997
dbCAN_3:eCAMI 0.9996 0.0018 0.9991 1.0002 0.7960 0.2776 0.7154 0.8766 0.9333 0.2452 0.8621 1.0045 0.8473 0.2533 0.7737 0.9208 0.9920 0.0129 0.9883 0.9958
dbCAN_3:HMMER 0.9994 0.0025 0.9986 1.0001 0.9739 0.0982 0.9451 1.0028 0.9846 0.0766 0.9621 1.0071 0.9741 0.0781 0.9511 0.9970 0.9986 0.0035 0.9975 0.9996
dbCAN_4 0.9994 0.0025 0.9986 1.0001 0.9739 0.0982 0.9451 1.0028 0.9846 0.0766 0.9621 1.0071 0.9741 0.0781 0.9511 0.9970 0.9986 0.0035 0.9975 0.9996
dbCAN_4:dbCAN-sub 0.9994 0.0025 0.9986 1.0001 0.9752 0.0982 0.9464 1.0040 0.9846 0.0766 0.9621 1.0071 0.9747 0.0782 0.9518 0.9977 0.9988 0.0032 0.9978 0.9997
dbCAN_4:DIAMOND 0.9989 0.0039 0.9978 1.0000 0.9987 0.0086 0.9962 1.0013 0.9730 0.1012 0.9433 1.0028 0.9825 0.0646 0.9635 1.0014 0.9988 0.0039 0.9976 0.9999
dbCAN_4:HMMER 0.9994 0.0025 0.9986 1.0001 0.9739 0.0982 0.9451 1.0028 0.9846 0.0766 0.9621 1.0071 0.9741 0.0781 0.9511 0.9970 0.9986 0.0035 0.9975 0.9996

Figure 6.11: Scatter plot of sensitivity against specificity for predicting PL CAZy class members per CAZyme classifier, overlaying a density map.

Figure 6.12: Summary statistics of CAZyme classifier performance for PL class classification, plotting the mean plus and minus the 95% confidence interval.

Figure 6.13: One-dimensional scatter plot of the statistical parameters per test set for the classification of PL class members, overlaying a box plot.

6.2.4 CE class classification

Table 6.6: Overall performance of CAZyme classifiers in the classification of CE class members
Prediction_tool Spec_Mean Spec_Standard_Deviation Spec_CI_Lower Spec_CI_Upper Sens_Mean Sens_Standard_Deviation Sens_CI_Lower Sens_CI_Upper Prec_Mean Prec_Standard_Deviation Prec_CI_Lower Prec_CI_Upper F1-score_Mean F1-score_Standard_Deviation F1-score_CI_Lower F1-score_CI_Upper Acc_Mean Acc_Standard_Deviation Acc_CI_Lower Acc_CI_Upper
CUPP 0.9955 0.0178 0.9914 0.9995 0.9114 0.1332 0.8810 0.9419 0.9598 0.1130 0.9339 0.9856 0.9250 0.1095 0.9000 0.9500 0.9900 0.0187 0.9857 0.9943
dbCAN_2 0.9937 0.0220 0.9887 0.9988 0.9213 0.1433 0.8886 0.9540 0.9519 0.1408 0.9197 0.9840 0.9224 0.1305 0.8926 0.9522 0.9893 0.0225 0.9842 0.9945
dbCAN_2:DIAMOND 0.9941 0.0210 0.9893 0.9989 0.8480 0.2469 0.7915 0.9044 0.9280 0.2089 0.8803 0.9757 0.8636 0.2234 0.8126 0.9147 0.9862 0.0242 0.9807 0.9918
dbCAN_2:HMMER 0.9945 0.0187 0.9902 0.9988 0.9208 0.1363 0.8896 0.9519 0.9525 0.1191 0.9253 0.9797 0.9248 0.1119 0.8993 0.9504 0.9901 0.0189 0.9858 0.9945
dbCAN_2:Hotpep 0.9905 0.0229 0.9852 0.9957 0.8508 0.2065 0.8036 0.8980 0.9128 0.1671 0.8746 0.9510 0.8555 0.1705 0.8165 0.8944 0.9821 0.0251 0.9764 0.9878
dbCAN_3 0.9936 0.0224 0.9885 0.9988 0.9283 0.1537 0.8932 0.9634 0.9526 0.1391 0.9208 0.9844 0.9236 0.1397 0.8916 0.9555 0.9896 0.0220 0.9846 0.9946
dbCAN_3:DIAMOND 0.9925 0.0234 0.9872 0.9979 0.9303 0.1704 0.8913 0.9692 0.9304 0.1864 0.8878 0.9730 0.9152 0.1682 0.8768 0.9537 0.9890 0.0231 0.9837 0.9943
dbCAN_3:eCAMI 0.9926 0.0213 0.9877 0.9975 0.8073 0.2445 0.7514 0.8632 0.9156 0.1909 0.8719 0.9592 0.8314 0.2020 0.7852 0.8775 0.9818 0.0249 0.9761 0.9875
dbCAN_3:HMMER 0.9953 0.0179 0.9912 0.9994 0.9230 0.1335 0.8925 0.9535 0.9602 0.1103 0.9350 0.9854 0.9302 0.1073 0.9057 0.9547 0.9910 0.0177 0.9870 0.9951
dbCAN_4 0.9953 0.0179 0.9912 0.9994 0.9783 0.0669 0.9630 0.9935 0.9603 0.1103 0.9351 0.9855 0.9638 0.0854 0.9443 0.9833 0.9943 0.0172 0.9904 0.9982
dbCAN_4:dbCAN-sub 0.9952 0.0181 0.9911 0.9993 0.9755 0.0680 0.9599 0.9910 0.9592 0.1120 0.9336 0.9848 0.9617 0.0863 0.9420 0.9814 0.9938 0.0175 0.9898 0.9978
dbCAN_4:DIAMOND 0.9931 0.0227 0.9879 0.9983 0.9755 0.0851 0.9560 0.9949 0.9476 0.1506 0.9132 0.9820 0.9507 0.1192 0.9234 0.9779 0.9919 0.0218 0.9870 0.9969
dbCAN_4:HMMER 0.9953 0.0179 0.9912 0.9994 0.9529 0.1012 0.9298 0.9760 0.9603 0.1103 0.9351 0.9855 0.9487 0.0957 0.9268 0.9706 0.9929 0.0173 0.9889 0.9968

Figure 6.14: Scatter plot of sensitivity against specificity for predicting CE CAZy class members per CAZyme classifier, overlaying a density map.

Figure 6.15: Summary statistics of CAZyme classifier performance for CE class classification, plotting the mean plus and minus the 95% confidence interval.

Figure 6.16: One-dimensional scatter plot of the statistical parameters per test set for the classification of CE class members, overlaying a box plot.

6.2.5 AA class classification

Table 6.7: Overall performance of CAZyme classifiers in the classification of AA class members
Prediction_tool Spec_Mean Spec_Standard_Deviation Spec_CI_Lower Spec_CI_Upper Sens_Mean Sens_Standard_Deviation Sens_CI_Lower Sens_CI_Upper Prec_Mean Prec_Standard_Deviation Prec_CI_Lower Prec_CI_Upper F1-score_Mean F1-score_Standard_Deviation F1-score_CI_Lower F1-score_CI_Upper Acc_Mean Acc_Standard_Deviation Acc_CI_Lower Acc_CI_Upper
CUPP 0.9919 0.0175 0.9868 0.9970 0.9047 0.1300 0.8670 0.9425 0.9184 0.1697 0.8691 0.9677 0.9011 0.1376 0.8611 0.9410 0.9841 0.0215 0.9778 0.9903
dbCAN_2 0.9919 0.0178 0.9868 0.9971 0.9190 0.1152 0.8855 0.9524 0.9208 0.1693 0.8716 0.9699 0.9104 0.1322 0.8720 0.9488 0.9856 0.0204 0.9796 0.9915
dbCAN_2:DIAMOND 0.9922 0.0178 0.9870 0.9974 0.8693 0.1786 0.8174 0.9212 0.9228 0.1710 0.8731 0.9724 0.8766 0.1603 0.8301 0.9232 0.9837 0.0212 0.9775 0.9899
dbCAN_2:HMMER 0.9913 0.0187 0.9859 0.9967 0.9375 0.0917 0.9109 0.9641 0.9191 0.1607 0.8724 0.9658 0.9173 0.1137 0.8843 0.9503 0.9856 0.0202 0.9797 0.9914
dbCAN_2:Hotpep 0.9920 0.0180 0.9867 0.9972 0.8737 0.1942 0.8173 0.9301 0.8983 0.2166 0.8354 0.9612 0.8733 0.1924 0.8174 0.9292 0.9827 0.0227 0.9761 0.9893
dbCAN_3 0.9911 0.0199 0.9853 0.9969 0.9835 0.0450 0.9705 0.9966 0.9236 0.1605 0.8770 0.9702 0.9436 0.1100 0.9117 0.9756 0.9901 0.0188 0.9846 0.9955
dbCAN_3:DIAMOND 0.9900 0.0219 0.9837 0.9964 0.9884 0.0359 0.9779 0.9988 0.9171 0.1693 0.8679 0.9662 0.9413 0.1144 0.9080 0.9745 0.9897 0.0200 0.9839 0.9955
dbCAN_3:eCAMI 0.9921 0.0167 0.9873 0.9970 0.7821 0.2611 0.7063 0.8579 0.8514 0.2809 0.7699 0.9330 0.8006 0.2558 0.7263 0.8749 0.9791 0.0233 0.9724 0.9859
dbCAN_3:HMMER 0.9907 0.0202 0.9848 0.9965 0.9880 0.0393 0.9766 0.9994 0.9192 0.1624 0.8720 0.9664 0.9434 0.1112 0.9111 0.9757 0.9901 0.0196 0.9844 0.9958
dbCAN_4 0.9907 0.0201 0.9848 0.9965 0.9937 0.0258 0.9862 1.0012 0.9198 0.1610 0.8731 0.9666 0.9464 0.1085 0.9149 0.9779 0.9907 0.0188 0.9853 0.9962
dbCAN_4:dbCAN-sub 0.9905 0.0202 0.9846 0.9963 0.9892 0.0331 0.9796 0.9988 0.9177 0.1620 0.8707 0.9648 0.9430 0.1090 0.9113 0.9746 0.9902 0.0189 0.9847 0.9956
dbCAN_4:DIAMOND 0.9903 0.0219 0.9839 0.9966 0.9700 0.0952 0.9423 0.9976 0.9196 0.1697 0.8703 0.9689 0.9306 0.1226 0.8950 0.9662 0.9897 0.0198 0.9840 0.9954
dbCAN_4:HMMER 0.9907 0.0202 0.9848 0.9965 0.9880 0.0393 0.9766 0.9994 0.9192 0.1624 0.8720 0.9664 0.9434 0.1112 0.9111 0.9757 0.9901 0.0196 0.9844 0.9958

Figure 6.17: Scatter plot of sensitivity against specificity for predicting AA CAZy class members per CAZyme classifier, overlaying a density map.

Figure 6.18: Summary statistics of CAZyme classifier performance for AA class classification, plotting the mean plus and minus the 95% confidence interval.

Figure 6.19: One-dimensional scatter plot of the statistical parameters per test set for the classification of AA class members, overlaying a box plot.

6.2.6 CBM class classification

Table 6.8: Overall performance of CAZyme classifiers in the classification of CBM class members
Prediction_tool Spec_Mean Spec_Standard_Deviation Spec_CI_Lower Spec_CI_Upper Sens_Mean Sens_Standard_Deviation Sens_CI_Lower Sens_CI_Upper Prec_Mean Prec_Standard_Deviation Prec_CI_Lower Prec_CI_Upper F1-score_Mean F1-score_Standard_Deviation F1-score_CI_Lower F1-score_CI_Upper Acc_Mean Acc_Standard_Deviation Acc_CI_Lower Acc_CI_Upper
CUPP 1.0000 0.0000 1.0000 1.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.8851 0.0851 0.8662 0.9040
dbCAN_2 0.9922 0.0119 0.9896 0.9948 0.7242 0.1925 0.6813 0.7670 0.9203 0.1333 0.8906 0.9500 0.7970 0.1542 0.7627 0.8314 0.9656 0.0257 0.9599 0.9713
dbCAN_2:DIAMOND 0.9914 0.0149 0.9881 0.9947 0.7854 0.2191 0.7367 0.8342 0.9219 0.1502 0.8885 0.9553 0.8325 0.1762 0.7933 0.8717 0.9719 0.0252 0.9663 0.9775
dbCAN_2:HMMER 0.9961 0.0088 0.9941 0.9980 0.4129 0.2165 0.3648 0.4611 0.8965 0.2304 0.8452 0.9478 0.5414 0.2150 0.4935 0.5892 0.9378 0.0296 0.9312 0.9444
dbCAN_2:Hotpep 0.9014 0.0511 0.8901 0.9128 0.7061 0.2102 0.6594 0.7529 0.4696 0.1731 0.4311 0.5081 0.5508 0.1723 0.5125 0.5892 0.8823 0.0551 0.8701 0.8946
dbCAN_3 0.9943 0.0109 0.9919 0.9968 0.7680 0.1926 0.7252 0.8109 0.9431 0.1355 0.9129 0.9732 0.8354 0.1546 0.8010 0.8698 0.9718 0.0228 0.9667 0.9768
dbCAN_3:DIAMOND 0.9931 0.0144 0.9899 0.9963 0.8675 0.1731 0.8290 0.9060 0.9512 0.1283 0.9227 0.9797 0.8986 0.1383 0.8678 0.9294 0.9814 0.0185 0.9773 0.9855
dbCAN_3:eCAMI 0.9470 0.0523 0.9354 0.9587 0.7346 0.2203 0.6856 0.7836 0.6615 0.2254 0.6113 0.7117 0.6810 0.2079 0.6347 0.7272 0.9267 0.0565 0.9141 0.9392
dbCAN_3:HMMER 0.9960 0.0088 0.9941 0.9980 0.4135 0.2148 0.3657 0.4613 0.8964 0.2304 0.8451 0.9476 0.5425 0.2134 0.4950 0.5900 0.9379 0.0300 0.9313 0.9446
dbCAN_4 0.9951 0.0103 0.9928 0.9974 0.7995 0.1921 0.7567 0.8422 0.9562 0.0919 0.9358 0.9767 0.8547 0.1392 0.8237 0.8857 0.9763 0.0207 0.9717 0.9809
dbCAN_4:dbCAN-sub 0.9927 0.0129 0.9898 0.9955 0.8193 0.1995 0.7749 0.8637 0.9376 0.1151 0.9120 0.9632 0.8576 0.1492 0.8244 0.8908 0.9761 0.0244 0.9707 0.9816
dbCAN_4:DIAMOND 0.9938 0.0146 0.9906 0.9970 0.8773 0.1695 0.8396 0.9151 0.9571 0.1273 0.9288 0.9854 0.9072 0.1362 0.8768 0.9375 0.9830 0.0182 0.9790 0.9870
dbCAN_4:HMMER 0.9958 0.0097 0.9937 0.9980 0.4740 0.2360 0.4215 0.5266 0.9184 0.1999 0.8739 0.9629 0.5954 0.2202 0.5464 0.6444 0.9436 0.0300 0.9370 0.9503

Figure 6.20: Scatter plot of sensitivity against specificity for predicting CBM CAZy class members per CAZyme classifier, overlaying a density map.

Figure 6.21: Summary statistics of CAZyme classifier performance for CBM class classification, plotting the mean plus and minus the 95% confidence interval.

Figure 6.22: One-dimensional scatter plot of the statistical parameters per test set for the classification of CBM class members, overlaying a box plot.

6.3 Performance per statistic

Instead of facet wrapping the plots by statistic and producing a plot per CAZy class, we can produce a plot per statistic (sensitivity, precision, etc.) and facet wrap by CAZy class. This makes it easier to compare CAZy classes and to evaluate performance one statistic at a time.
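
For reference, the five statistics reported per CAZy class follow from the counts of a binary confusion matrix. The sketch below uses the standard definitions. The function name `binary_stats` and the counts are hypothetical. Returning 0.0 when a denominator is zero is an assumption, though it is consistent with rows such as CUPP in Table 6.8, where sensitivity, precision and F1-score are all 0 while specificity is 1.

```python
def binary_stats(tp, fp, tn, fn):
    """Compute binary classification statistics from confusion-matrix counts.

    tp/fn: class members predicted as members / non-members.
    tn/fp: non-members predicted as non-members / members.
    """
    def ratio(numerator, denominator):
        # Assumed convention: an undefined ratio (0/0) is reported as 0.0
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall: fraction of members found
    specificity = ratio(tn, tn + fp)   # fraction of non-members rejected
    precision = ratio(tp, tp + fp)     # fraction of predictions that are right
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    return {
        "specificity": specificity,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical test set of 100 proteins, 10 of which belong to the class,
# scored for a tool that never predicts this class.
never_predicts = binary_stats(tp=0, fp=0, tn=90, fn=10)
print(never_predicts)
```

A tool that never assigns a class scores perfect specificity and a high accuracy on an imbalanced test set, which is why sensitivity, precision and F1-score need to be read alongside them.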

6.3.1 CAZy class specificity

6.3.2 CAZy class sensitivity

6.3.3 CAZy class precision

6.3.4 CAZy class F1-score

6.3.5 CAZy class accuracy

7 CAZy class multilabel classification

A single CAZyme can belong to multiple CAZy classes, which makes CAZy class annotation a multilabel classification problem. To evaluate the multilabel classification of CAZy classes, the Rand Index (RI) and Adjusted Rand Index (ARI) were calculated.

The RI is a measure of accuracy across all potential classifications of a protein. It ranges from 0 (no correct annotations) to 1 (all annotations correct). The ARI is the RI adjusted for chance: 0 is equivalent to assigning the CAZy class annotations randomly, negative values indicate annotations that are systematically incorrect (worse than chance), and 1 indicates that all annotations are correct.
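
As an illustration, the sketch below implements the standard pair-counting definitions of the RI and ARI. It assumes each protein is represented by two binary vectors of CAZy class membership (ground truth and prediction) in the order GH, GT, PL, CE, AA, CBM. That representation, the function names and the example protein are assumptions made for illustration; the report does not show how the indices were computed.

```python
from collections import Counter
from itertools import combinations
from math import comb


def rand_index(truth, pred):
    """Fraction of item pairs on which the two labellings agree
    (both put the pair together, or both keep it apart)."""
    pairs = list(combinations(range(len(truth)), 2))
    agreements = sum(
        (truth[i] == truth[j]) == (pred[i] == pred[j]) for i, j in pairs
    )
    return agreements / len(pairs)


def adjusted_rand_index(truth, pred):
    """Rand index corrected for the agreement expected by chance."""
    n_pairs = comb(len(truth), 2)
    # Pairs grouped together in both labellings, in truth, and in pred
    together_both = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    together_truth = sum(comb(c, 2) for c in Counter(truth).values())
    together_pred = sum(comb(c, 2) for c in Counter(pred).values())
    expected = together_truth * together_pred / n_pairs
    maximum = (together_truth + together_pred) / 2
    if maximum == expected:
        # Degenerate case: both labellings are trivially identical
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Hypothetical protein: a GH enzyme with a CBM domain, where the tool
# predicted only the GH class. Order: GH, GT, PL, CE, AA, CBM.
truth = [1, 0, 0, 0, 0, 1]
pred = [1, 0, 0, 0, 0, 0]
ri = rand_index(truth, pred)
ari = adjusted_rand_index(truth, pred)
print(f"RI={ri:.4f} ARI={ari:.4f}")
```

Because the ARI subtracts the agreement expected by chance, a partially correct multilabel prediction scores noticeably lower on the ARI than on the RI.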

8 CAZy class taxonomic performance

8.1 Across all of CAZy

8.1.1 Specificity

Table 8.1: Overall performance (represented by the specificity) of CAZy class classification by CAZyme classifiers per taxonomy group
Prediction_tool Mean Standard_Deviation Lower_CI Upper_CI Tax_group
dbCAN_2 0.9933 0.0259 0.9908 0.9958 All
dbCAN_2:HMMER 0.9939 0.0264 0.9913 0.9964 All
dbCAN_2:DIAMOND 0.9920 0.0270 0.9894 0.9946 All
dbCAN_2:Hotpep 0.9733 0.0491 0.9685 0.9781 All
dbCAN_3 0.9934 0.0265 0.9909 0.9960 All
dbCAN_3:HMMER 0.9941 0.0261 0.9916 0.9967 All
dbCAN_3:DIAMOND 0.9909 0.0289 0.9881 0.9937 All
dbCAN_3:eCAMI 0.9829 0.0388 0.9791 0.9867 All
dbCAN_4 0.9936 0.0267 0.9910 0.9962 All
dbCAN_4:HMMER 0.9941 0.0262 0.9915 0.9966 All
dbCAN_4:DIAMOND 0.9915 0.0285 0.9888 0.9943 All
dbCAN_4:dbCAN-sub 0.9931 0.0269 0.9905 0.9957 All
CUPP 0.9953 0.0246 0.9929 0.9977 All
dbCAN_2 0.9951 0.0157 0.9930 0.9973 Bacteria
dbCAN_2:HMMER 0.9959 0.0139 0.9939 0.9978 Bacteria
dbCAN_2:DIAMOND 0.9932 0.0185 0.9907 0.9958 Bacteria
dbCAN_2:Hotpep 0.9695 0.0509 0.9625 0.9766 Bacteria
dbCAN_3 0.9950 0.0160 0.9928 0.9973 Bacteria
dbCAN_3:HMMER 0.9962 0.0133 0.9944 0.9981 Bacteria
dbCAN_3:DIAMOND 0.9930 0.0191 0.9904 0.9957 Bacteria
dbCAN_3:eCAMI 0.9798 0.0392 0.9743 0.9852 Bacteria
dbCAN_4 0.9958 0.0137 0.9939 0.9977 Bacteria
dbCAN_4:HMMER 0.9962 0.0133 0.9944 0.9981 Bacteria
dbCAN_4:DIAMOND 0.9942 0.0173 0.9918 0.9966 Bacteria
dbCAN_4:dbCAN-sub 0.9954 0.0139 0.9935 0.9973 Bacteria
CUPP 0.9970 0.0128 0.9952 0.9988 Bacteria
dbCAN_2 0.9916 0.0328 0.9871 0.9961 Eukaryote
dbCAN_2:HMMER 0.9920 0.0343 0.9873 0.9967 Eukaryote
dbCAN_2:DIAMOND 0.9909 0.0332 0.9864 0.9954 Eukaryote
dbCAN_2:Hotpep 0.9769 0.0472 0.9705 0.9833 Eukaryote
dbCAN_3 0.9919 0.0335 0.9874 0.9965 Eukaryote
dbCAN_3:HMMER 0.9921 0.0341 0.9875 0.9968 Eukaryote
dbCAN_3:DIAMOND 0.9888 0.0357 0.9839 0.9936 Eukaryote
dbCAN_3:eCAMI 0.9859 0.0382 0.9807 0.9911 Eukaryote
dbCAN_4 0.9916 0.0348 0.9868 0.9963 Eukaryote
dbCAN_4:HMMER 0.9920 0.0341 0.9874 0.9967 Eukaryote
dbCAN_4:DIAMOND 0.9890 0.0360 0.9841 0.9939 Eukaryote
dbCAN_4:dbCAN-sub 0.9908 0.0350 0.9861 0.9956 Eukaryote
CUPP 0.9937 0.0320 0.9894 0.9981 Eukaryote

8.1.2 Sensitivity

Table 8.2: Overall performance (represented by the sensitivity) of CAZy class classification by CAZyme classifiers per taxonomy group
Prediction_tool Mean Standard_Deviation Lower_CI Upper_CI Tax_group
dbCAN_2 0.8830 0.1710 0.8592 0.9068 Bacteria
dbCAN_2:HMMER 0.8194 0.2217 0.7885 0.8502 Bacteria
dbCAN_2:DIAMOND 0.8889 0.1819 0.8636 0.9142 Bacteria
dbCAN_2:Hotpep 0.8176 0.2115 0.7882 0.8470 Bacteria
dbCAN_3 0.9128 0.1608 0.8905 0.9352 Bacteria
dbCAN_3:HMMER 0.8256 0.2214 0.7948 0.8564 Bacteria
dbCAN_3:DIAMOND 0.9334 0.1427 0.9136 0.9533 Bacteria
dbCAN_3:eCAMI 0.8435 0.2067 0.8148 0.8722 Bacteria
dbCAN_4 0.9477 0.1076 0.9327 0.9626 Bacteria
dbCAN_4:HMMER 0.8545 0.1946 0.8275 0.8816 Bacteria
dbCAN_4:DIAMOND 0.9317 0.1461 0.9114 0.9520 Bacteria
dbCAN_4:dbCAN-sub 0.9558 0.1011 0.9417 0.9699 Bacteria
CUPP 0.7168 0.3818 0.6637 0.7699 Bacteria
dbCAN_2 0.8644 0.1754 0.8405 0.8882 Eukaryote
dbCAN_2:HMMER 0.7960 0.2670 0.7596 0.8323 Eukaryote
dbCAN_2:DIAMOND 0.8583 0.2229 0.8280 0.8887 Eukaryote
dbCAN_2:Hotpep 0.7877 0.1978 0.7608 0.8146 Eukaryote
dbCAN_3 0.9241 0.1291 0.9066 0.9417 Eukaryote
dbCAN_3:HMMER 0.8252 0.2649 0.7892 0.8613 Eukaryote
dbCAN_3:DIAMOND 0.9649 0.1093 0.9500 0.9798 Eukaryote
dbCAN_3:eCAMI 0.7790 0.2176 0.7494 0.8086 Eukaryote
dbCAN_4 0.9234 0.1457 0.9036 0.9433 Eukaryote
dbCAN_4:HMMER 0.8317 0.2562 0.7968 0.8665 Eukaryote
dbCAN_4:DIAMOND 0.9831 0.0487 0.9765 0.9898 Eukaryote
dbCAN_4:dbCAN-sub 0.9189 0.1433 0.8994 0.9384 Eukaryote
CUPP 0.7121 0.3682 0.6620 0.7622 Eukaryote

8.1.3 Precision

Table 8.3: Overall performance (represented by the precision) of CAZy class classification by CAZyme classifiers per taxonomy group
Prediction_tool Mean Standard_Deviation Lower_CI Upper_CI Tax_group
dbCAN_2 0.9664 0.1215 0.9495 0.9833 Bacteria
dbCAN_2:HMMER 0.9706 0.0968 0.9571 0.9840 Bacteria
dbCAN_2:DIAMOND 0.9647 0.1349 0.9459 0.9834 Bacteria
dbCAN_2:Hotpep 0.8551 0.2420 0.8214 0.8887 Bacteria
dbCAN_3 0.9686 0.1160 0.9525 0.9848 Bacteria
dbCAN_3:HMMER 0.9725 0.0929 0.9596 0.9854 Bacteria
dbCAN_3:DIAMOND 0.9712 0.1113 0.9557 0.9867 Bacteria
dbCAN_3:eCAMI 0.8865 0.2140 0.8568 0.9162 Bacteria
dbCAN_4 0.9783 0.0746 0.9680 0.9887 Bacteria
dbCAN_4:HMMER 0.9755 0.0855 0.9636 0.9873 Bacteria
dbCAN_4:DIAMOND 0.9734 0.1103 0.9580 0.9887 Bacteria
dbCAN_4:dbCAN-sub 0.9767 0.0751 0.9663 0.9872 Bacteria
CUPP 0.7806 0.4022 0.7246 0.8365 Bacteria
dbCAN_2 0.9476 0.1358 0.9291 0.9661 Eukaryote
dbCAN_2:HMMER 0.9390 0.1748 0.9152 0.9628 Eukaryote
dbCAN_2:DIAMOND 0.9333 0.1738 0.9097 0.9569 Eukaryote
dbCAN_2:Hotpep 0.8568 0.2458 0.8233 0.8902 Eukaryote
dbCAN_3 0.9622 0.1043 0.9480 0.9764 Eukaryote
dbCAN_3:HMMER 0.9427 0.1653 0.9202 0.9651 Eukaryote
dbCAN_3:DIAMOND 0.9462 0.1359 0.9277 0.9646 Eukaryote
dbCAN_3:eCAMI 0.8894 0.2221 0.8592 0.9196 Eukaryote
dbCAN_4 0.9589 0.1086 0.9442 0.9737 Eukaryote
dbCAN_4:HMMER 0.9483 0.1511 0.9277 0.9688 Eukaryote
dbCAN_4:DIAMOND 0.9537 0.1191 0.9375 0.9699 Eukaryote
dbCAN_4:dbCAN-sub 0.9524 0.1182 0.9364 0.9685 Eukaryote
CUPP 0.7765 0.3957 0.7227 0.8304 Eukaryote

8.1.4 F1-score

Table 8.4: Overall performance (represented by the F1-score) of CAZy class classification by CAZyme classifiers per taxonomy group
Prediction_tool Mean Standard_Deviation Lower_CI Upper_CI Tax_group
dbCAN_2 0.9120 0.1418 0.8923 0.9317 Bacteria
dbCAN_2:HMMER 0.8680 0.1741 0.8438 0.8922 Bacteria
dbCAN_2:DIAMOND 0.9136 0.1536 0.8922 0.9349 Bacteria
dbCAN_2:Hotpep 0.8140 0.2090 0.7850 0.8431 Bacteria
dbCAN_3 0.9298 0.1366 0.9108 0.9488 Bacteria
dbCAN_3:HMMER 0.8729 0.1729 0.8489 0.8970 Bacteria
dbCAN_3:DIAMOND 0.9434 0.1209 0.9266 0.9602 Bacteria
dbCAN_3:eCAMI 0.8506 0.1978 0.8232 0.8781 Bacteria
dbCAN_4 0.9567 0.0853 0.9449 0.9686 Bacteria
dbCAN_4:HMMER 0.8949 0.1487 0.8742 0.9156 Bacteria
dbCAN_4:DIAMOND 0.9432 0.1237 0.9260 0.9604 Bacteria
dbCAN_4:dbCAN-sub 0.9607 0.0815 0.9493 0.9720 Bacteria
CUPP 0.7409 0.3859 0.6872 0.7946 Bacteria
dbCAN_2 0.8934 0.1458 0.8736 0.9133 Eukaryote
dbCAN_2:HMMER 0.8353 0.2285 0.8042 0.8663 Eukaryote
dbCAN_2:DIAMOND 0.8794 0.1936 0.8531 0.9057 Eukaryote
dbCAN_2:Hotpep 0.7977 0.2034 0.7700 0.8253 Eukaryote
dbCAN_3 0.9339 0.1028 0.9199 0.9478 Eukaryote
dbCAN_3:HMMER 0.8532 0.2236 0.8227 0.8836 Eukaryote
dbCAN_3:DIAMOND 0.9484 0.1137 0.9329 0.9638 Eukaryote
dbCAN_3:eCAMI 0.8147 0.2055 0.7867 0.8426 Eukaryote
dbCAN_4 0.9301 0.1179 0.9140 0.9461 Eukaryote
dbCAN_4:HMMER 0.8591 0.2146 0.8299 0.8883 Eukaryote
dbCAN_4:DIAMOND 0.9629 0.0818 0.9517 0.9740 Eukaryote
dbCAN_4:dbCAN-sub 0.9249 0.1207 0.9085 0.9413 Eukaryote
CUPP 0.7370 0.3750 0.6860 0.7880 Eukaryote

8.1.5 Accuracy

Table 8.5: Overall performance (represented by the accuracy) of CAZy class classification by CAZyme classifiers per taxonomy group
Prediction_tool Mean Standard_Deviation Lower_CI Upper_CI Tax_group
dbCAN_2 0.9747 0.0436 0.9687 0.9808 Bacteria
dbCAN_2:HMMER 0.9646 0.0478 0.9580 0.9713 Bacteria
dbCAN_2:DIAMOND 0.9766 0.0455 0.9703 0.9830 Bacteria
dbCAN_2:Hotpep 0.9366 0.0664 0.9274 0.9458 Bacteria
dbCAN_3 0.9813 0.0408 0.9757 0.9870 Bacteria
dbCAN_3:HMMER 0.9658 0.0480 0.9591 0.9725 Bacteria
dbCAN_3:DIAMOND 0.9844 0.0387 0.9790 0.9898 Bacteria
dbCAN_3:eCAMI 0.9578 0.0539 0.9503 0.9653 Bacteria
dbCAN_4 0.9855 0.0392 0.9801 0.9910 Bacteria
dbCAN_4:HMMER 0.9684 0.0473 0.9618 0.9750 Bacteria
dbCAN_4:DIAMOND 0.9839 0.0400 0.9784 0.9895 Bacteria
dbCAN_4:dbCAN-sub 0.9864 0.0359 0.9814 0.9914 Bacteria
CUPP 0.9516 0.0743 0.9413 0.9620 Bacteria
dbCAN_2 0.9728 0.0431 0.9670 0.9787 Eukaryote
dbCAN_2:HMMER 0.9656 0.0332 0.9611 0.9702 Eukaryote
dbCAN_2:DIAMOND 0.9759 0.0448 0.9698 0.9820 Eukaryote
dbCAN_2:Hotpep 0.9401 0.0698 0.9306 0.9496 Eukaryote
dbCAN_3 0.9825 0.0259 0.9790 0.9860 Eukaryote
dbCAN_3:HMMER 0.9690 0.0331 0.9645 0.9735 Eukaryote
dbCAN_3:DIAMOND 0.9882 0.0241 0.9850 0.9915 Eukaryote
dbCAN_3:eCAMI 0.9528 0.0587 0.9449 0.9608 Eukaryote
dbCAN_4 0.9822 0.0252 0.9788 0.9857 Eukaryote
dbCAN_4:HMMER 0.9695 0.0325 0.9650 0.9739 Eukaryote
dbCAN_4:DIAMOND 0.9900 0.0226 0.9870 0.9931 Eukaryote
dbCAN_4:dbCAN-sub 0.9803 0.0264 0.9767 0.9839 Eukaryote
CUPP 0.9559 0.0466 0.9496 0.9623 Eukaryote

8.2 Per CAZy class

8.2.1 Specificity

Table 8.6: Performance (represented by the specificity) of CAZy class classification by CAZyme classifiers per taxonomy group and CAZy class
Prediction_tool Mean Standard_Deviation Lower_CI Upper_CI Tax_group CAZy_class
dbCAN_2 0.9917 0.0245 0.9862 0.9972 All GH
dbCAN_2:HMMER 0.9927 0.0242 0.9873 0.9981 All GH
dbCAN_2:DIAMOND 0.9862 0.0276 0.9801 0.9924 All GH
dbCAN_2:Hotpep 0.9833 0.0286 0.9769 0.9897 All GH
dbCAN_3 0.9923 0.0228 0.9873 0.9974 All GH
dbCAN_3:HMMER 0.9939 0.0224 0.9889 0.9988 All GH
dbCAN_3:DIAMOND 0.9844 0.0292 0.9779 0.9909 All GH
dbCAN_3:eCAMI 0.9848 0.0295 0.9782 0.9913 All GH
dbCAN_4 0.9927 0.0240 0.9873 0.9980 All GH
dbCAN_4:HMMER 0.9939 0.0224 0.9889 0.9988 All GH
dbCAN_4:DIAMOND 0.9864 0.0279 0.9802 0.9927 All GH
dbCAN_4:dbCAN-sub 0.9924 0.0238 0.9871 0.9977 All GH
CUPP 0.9933 0.0212 0.9886 0.9980 All GH
dbCAN_2 0.9932 0.0144 0.9886 0.9978 Bacteria GH
dbCAN_2:HMMER 0.9945 0.0133 0.9902 0.9987 Bacteria GH
dbCAN_2:DIAMOND 0.9844 0.0230 0.9770 0.9918 Bacteria GH
dbCAN_2:Hotpep 0.9753 0.0283 0.9662 0.9843 Bacteria GH
dbCAN_3 0.9932 0.0137 0.9888 0.9975 Bacteria GH
dbCAN_3:HMMER 0.9954 0.0124 0.9914 0.9993 Bacteria GH
dbCAN_3:DIAMOND 0.9846 0.0239 0.9769 0.9922 Bacteria GH
dbCAN_3:eCAMI 0.9794 0.0278 0.9705 0.9883 Bacteria GH
dbCAN_4 0.9933 0.0141 0.9888 0.9978 Bacteria GH
dbCAN_4:HMMER 0.9954 0.0124 0.9914 0.9993 Bacteria GH
dbCAN_4:DIAMOND 0.9891 0.0196 0.9829 0.9954 Bacteria GH
dbCAN_4:dbCAN-sub 0.9927 0.0143 0.9881 0.9973 Bacteria GH
CUPP 0.9935 0.0140 0.9890 0.9980 Bacteria GH
dbCAN_2 0.9902 0.0318 0.9801 1.0004 Eukaryote GH
dbCAN_2:HMMER 0.9909 0.0317 0.9808 1.0010 Eukaryote GH
dbCAN_2:DIAMOND 0.9881 0.0318 0.9779 0.9982 Eukaryote GH
dbCAN_2:Hotpep 0.9913 0.0269 0.9827 0.9999 Eukaryote GH
dbCAN_3 0.9915 0.0295 0.9821 1.0009 Eukaryote GH
dbCAN_3:HMMER 0.9923 0.0293 0.9830 1.0017 Eukaryote GH
dbCAN_3:DIAMOND 0.9843 0.0339 0.9734 0.9951 Eukaryote GH
dbCAN_3:eCAMI 0.9901 0.0306 0.9803 0.9998 Eukaryote GH
dbCAN_4 0.9920 0.0312 0.9821 1.0020 Eukaryote GH
dbCAN_4:HMMER 0.9923 0.0293 0.9830 1.0017 Eukaryote GH
dbCAN_4:DIAMOND 0.9837 0.0344 0.9727 0.9947 Eukaryote GH
dbCAN_4:dbCAN-sub 0.9921 0.0308 0.9823 1.0019 Eukaryote GH
CUPP 0.9931 0.0268 0.9845 1.0016 Eukaryote GH
dbCAN_2 0.9927 0.0454 0.9826 1.0028 All GT
dbCAN_2:HMMER 0.9904 0.0487 0.9796 1.0012 All GT
dbCAN_2:DIAMOND 0.9919 0.0460 0.9817 1.0021 All GT
dbCAN_2:Hotpep 0.9924 0.0421 0.9830 1.0017 All GT
dbCAN_3 0.9914 0.0474 0.9808 1.0019 All GT
dbCAN_3:HMMER 0.9904 0.0487 0.9796 1.0012 All GT
dbCAN_3:DIAMOND 0.9893 0.0486 0.9784 1.0001 All GT
dbCAN_3:eCAMI 0.9922 0.0439 0.9824 1.0019 All GT
dbCAN_4 0.9900 0.0492 0.9790 1.0009 All GT
dbCAN_4:HMMER 0.9904 0.0487 0.9796 1.0012 All GT
dbCAN_4:DIAMOND 0.9893 0.0488 0.9785 1.0002 All GT
dbCAN_4:dbCAN-sub 0.9900 0.0492 0.9790 1.0009 All GT
CUPP 0.9921 0.0463 0.9818 1.0024 All GT
dbCAN_2 0.9991 0.0041 0.9978 1.0004 Bacteria GT
dbCAN_2:HMMER 0.9990 0.0044 0.9976 1.0004 Bacteria GT
dbCAN_2:DIAMOND 0.9986 0.0070 0.9963 1.0008 Bacteria GT
dbCAN_2:Hotpep 0.9976 0.0060 0.9957 0.9995 Bacteria GT
dbCAN_3 0.9991 0.0042 0.9977 1.0004 Bacteria GT
dbCAN_3:HMMER 0.9990 0.0044 0.9976 1.0004 Bacteria GT
dbCAN_3:DIAMOND 0.9983 0.0073 0.9959 1.0006 Bacteria GT
dbCAN_3:eCAMI 0.9976 0.0083 0.9949 1.0003 Bacteria GT
dbCAN_4 0.9987 0.0047 0.9972 1.0002 Bacteria GT
dbCAN_4:HMMER 0.9990 0.0044 0.9976 1.0004 Bacteria GT
dbCAN_4:DIAMOND 0.9983 0.0073 0.9959 1.0006 Bacteria GT
dbCAN_4:dbCAN-sub 0.9987 0.0047 0.9972 1.0002 Bacteria GT
CUPP 0.9994 0.0039 0.9982 1.0006 Bacteria GT
dbCAN_2 0.9864 0.0638 0.9659 1.0068 Eukaryote GT
dbCAN_2:HMMER 0.9818 0.0680 0.9600 1.0035 Eukaryote GT
dbCAN_2:DIAMOND 0.9852 0.0644 0.9646 1.0058 Eukaryote GT
dbCAN_2:Hotpep 0.9871 0.0592 0.9682 1.0060 Eukaryote GT
dbCAN_3 0.9837 0.0664 0.9625 1.0050 Eukaryote GT
dbCAN_3:HMMER 0.9818 0.0680 0.9600 1.0035 Eukaryote GT
dbCAN_3:DIAMOND 0.9803 0.0676 0.9587 1.0019 Eukaryote GT
dbCAN_3:eCAMI 0.9867 0.0614 0.9671 1.0064 Eukaryote GT
dbCAN_4 0.9812 0.0687 0.9592 1.0032 Eukaryote GT
dbCAN_4:HMMER 0.9818 0.0680 0.9600 1.0035 Eukaryote GT
dbCAN_4:DIAMOND 0.9804 0.0678 0.9587 1.0021 Eukaryote GT
dbCAN_4:dbCAN-sub 0.9812 0.0687 0.9592 1.0032 Eukaryote GT
CUPP 0.9848 0.0649 0.9640 1.0056 Eukaryote GT
dbCAN_2 0.9998 0.0012 0.9995 1.0002 All PL
dbCAN_2:HMMER 0.9998 0.0012 0.9995 1.0002 All PL
dbCAN_2:DIAMOND 0.9996 0.0019 0.9991 1.0002 All PL
dbCAN_2:Hotpep 0.9993 0.0027 0.9985 1.0001 All PL
dbCAN_3 0.9994 0.0025 0.9986 1.0001 All PL
dbCAN_3:HMMER 0.9994 0.0025 0.9986 1.0001 All PL
dbCAN_3:DIAMOND 0.9989 0.0039 0.9978 1.0000 All PL
dbCAN_3:eCAMI 0.9996 0.0018 0.9991 1.0002 All PL
dbCAN_4 0.9994 0.0025 0.9986 1.0001 All PL
dbCAN_4:HMMER 0.9994 0.0025 0.9986 1.0001 All PL
dbCAN_4:DIAMOND 0.9989 0.0039 0.9978 1.0000 All PL
dbCAN_4:dbCAN-sub 0.9994 0.0025 0.9986 1.0001 All PL
CUPP 0.9996 0.0019 0.9990 1.0002 All PL
dbCAN_2 1.0000 0.0000 1.0000 1.0000 Bacteria PL
dbCAN_2:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria PL
dbCAN_2:DIAMOND 0.9997 0.0018 0.9990 1.0003 Bacteria PL
dbCAN_2:Hotpep 0.9992 0.0030 0.9981 1.0003 Bacteria PL
dbCAN_3 0.9996 0.0021 0.9988 1.0004 Bacteria PL
dbCAN_3:HMMER 0.9996 0.0021 0.9988 1.0004 Bacteria PL
dbCAN_3:DIAMOND 0.9993 0.0028 0.9982 1.0003 Bacteria PL
dbCAN_3:eCAMI 0.9997 0.0017 0.9991 1.0003 Bacteria PL
dbCAN_4 0.9996 0.0021 0.9988 1.0004 Bacteria PL
dbCAN_4:HMMER 0.9996 0.0021 0.9988 1.0004 Bacteria PL
dbCAN_4:DIAMOND 0.9993 0.0028 0.9982 1.0003 Bacteria PL
dbCAN_4:dbCAN-sub 0.9996 0.0021 0.9988 1.0004 Bacteria PL
CUPP 1.0000 0.0000 1.0000 1.0000 Bacteria PL
dbCAN_2 0.9995 0.0021 0.9984 1.0006 Eukaryote PL
dbCAN_2:HMMER 0.9995 0.0021 0.9984 1.0006 Eukaryote PL
dbCAN_2:DIAMOND 0.9995 0.0021 0.9984 1.0006 Eukaryote PL
dbCAN_2:Hotpep 0.9995 0.0021 0.9984 1.0006 Eukaryote PL
dbCAN_3 0.9989 0.0031 0.9973 1.0005 Eukaryote PL
dbCAN_3:HMMER 0.9989 0.0031 0.9973 1.0005 Eukaryote PL
dbCAN_3:DIAMOND 0.9982 0.0054 0.9955 1.0010 Eukaryote PL
dbCAN_3:eCAMI 0.9995 0.0021 0.9984 1.0006 Eukaryote PL
dbCAN_4 0.9989 0.0030 0.9974 1.0005 Eukaryote PL
dbCAN_4:HMMER 0.9989 0.0031 0.9973 1.0005 Eukaryote PL
dbCAN_4:DIAMOND 0.9982 0.0053 0.9955 1.0010 Eukaryote PL
dbCAN_4:dbCAN-sub 0.9989 0.0030 0.9974 1.0005 Eukaryote PL
CUPP 0.9989 0.0031 0.9973 1.0005 Eukaryote PL
dbCAN_2 0.9937 0.0220 0.9887 0.9988 All CE
dbCAN_2:HMMER 0.9945 0.0187 0.9902 0.9988 All CE
dbCAN_2:DIAMOND 0.9941 0.0210 0.9893 0.9989 All CE
dbCAN_2:Hotpep 0.9905 0.0229 0.9852 0.9957 All CE
dbCAN_3 0.9936 0.0224 0.9885 0.9988 All CE
dbCAN_3:HMMER 0.9953 0.0179 0.9912 0.9994 All CE
dbCAN_3:DIAMOND 0.9925 0.0234 0.9872 0.9979 All CE
dbCAN_3:eCAMI 0.9926 0.0213 0.9877 0.9975 All CE
dbCAN_4 0.9953 0.0179 0.9912 0.9994 All CE
dbCAN_4:HMMER 0.9953 0.0179 0.9912 0.9994 All CE
dbCAN_4:DIAMOND 0.9931 0.0227 0.9879 0.9983 All CE
dbCAN_4:dbCAN-sub 0.9952 0.0181 0.9911 0.9993 All CE
CUPP 0.9955 0.0178 0.9914 0.9995 All CE
dbCAN_2 0.9886 0.0294 0.9792 0.9980 Bacteria CE
dbCAN_2:HMMER 0.9906 0.0250 0.9826 0.9986 Bacteria CE
dbCAN_2:DIAMOND 0.9891 0.0281 0.9801 0.9981 Bacteria CE
dbCAN_2:Hotpep 0.9829 0.0295 0.9735 0.9924 Bacteria CE
dbCAN_3 0.9884 0.0299 0.9789 0.9980 Bacteria CE
dbCAN_3:HMMER 0.9919 0.0241 0.9842 0.9996 Bacteria CE
dbCAN_3:DIAMOND 0.9894 0.0285 0.9803 0.9985 Bacteria CE
dbCAN_3:eCAMI 0.9862 0.0279 0.9773 0.9951 Bacteria CE
dbCAN_4 0.9919 0.0241 0.9842 0.9996 Bacteria CE
dbCAN_4:HMMER 0.9919 0.0241 0.9842 0.9996 Bacteria CE
dbCAN_4:DIAMOND 0.9904 0.0276 0.9816 0.9993 Bacteria CE
dbCAN_4:dbCAN-sub 0.9919 0.0241 0.9842 0.9996 Bacteria CE
CUPP 0.9919 0.0239 0.9843 0.9995 Bacteria CE
dbCAN_2 0.9994 0.0035 0.9982 1.0006 Eukaryote CE
dbCAN_2:HMMER 0.9988 0.0042 0.9974 1.0003 Eukaryote CE
dbCAN_2:DIAMOND 0.9997 0.0018 0.9991 1.0003 Eukaryote CE
dbCAN_2:Hotpep 0.9989 0.0042 0.9974 1.0003 Eukaryote CE
dbCAN_3 0.9994 0.0035 0.9982 1.0006 Eukaryote CE
dbCAN_3:HMMER 0.9991 0.0039 0.9978 1.0004 Eukaryote CE
dbCAN_3:DIAMOND 0.9960 0.0156 0.9907 1.0013 Eukaryote CE
dbCAN_3:eCAMI 0.9997 0.0018 0.9991 1.0003 Eukaryote CE
dbCAN_4 0.9991 0.0038 0.9978 1.0004 Eukaryote CE
dbCAN_4:HMMER 0.9991 0.0039 0.9978 1.0004 Eukaryote CE
dbCAN_4:DIAMOND 0.9960 0.0155 0.9908 1.0013 Eukaryote CE
dbCAN_4:dbCAN-sub 0.9989 0.0054 0.9970 1.0007 Eukaryote CE
CUPP 0.9994 0.0035 0.9982 1.0006 Eukaryote CE
dbCAN_2 0.9919 0.0178 0.9868 0.9971 All AA
dbCAN_2:HMMER 0.9913 0.0187 0.9859 0.9967 All AA
dbCAN_2:DIAMOND 0.9922 0.0178 0.9870 0.9974 All AA
dbCAN_2:Hotpep 0.9920 0.0180 0.9867 0.9972 All AA
dbCAN_3 0.9911 0.0199 0.9853 0.9969 All AA
dbCAN_3:HMMER 0.9907 0.0202 0.9848 0.9965 All AA
dbCAN_3:DIAMOND 0.9900 0.0219 0.9837 0.9964 All AA
dbCAN_3:eCAMI 0.9921 0.0167 0.9873 0.9970 All AA
dbCAN_4 0.9907 0.0201 0.9848 0.9965 All AA
dbCAN_4:HMMER 0.9907 0.0202 0.9848 0.9965 All AA
dbCAN_4:DIAMOND 0.9903 0.0219 0.9839 0.9966 All AA
dbCAN_4:dbCAN-sub 0.9905 0.0202 0.9846 0.9963 All AA
CUPP 0.9919 0.0175 0.9868 0.9970 All AA
dbCAN_2 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:Hotpep 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:eCAMI 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:dbCAN-sub 1.0000 0.0000 1.0000 1.0000 Bacteria AA
CUPP 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2 0.9896 0.0197 0.9830 0.9961 Eukaryote AA
dbCAN_2:HMMER 0.9887 0.0206 0.9818 0.9956 Eukaryote AA
dbCAN_2:DIAMOND 0.9899 0.0198 0.9833 0.9965 Eukaryote AA
dbCAN_2:Hotpep 0.9896 0.0200 0.9829 0.9962 Eukaryote AA
dbCAN_3 0.9884 0.0220 0.9811 0.9958 Eukaryote AA
dbCAN_3:HMMER 0.9879 0.0223 0.9804 0.9953 Eukaryote AA
dbCAN_3:DIAMOND 0.9871 0.0242 0.9790 0.9951 Eukaryote AA
dbCAN_3:eCAMI 0.9898 0.0184 0.9836 0.9959 Eukaryote AA
dbCAN_4 0.9879 0.0223 0.9805 0.9953 Eukaryote AA
dbCAN_4:HMMER 0.9879 0.0223 0.9804 0.9953 Eukaryote AA
dbCAN_4:DIAMOND 0.9874 0.0242 0.9793 0.9954 Eukaryote AA
dbCAN_4:dbCAN-sub 0.9876 0.0223 0.9802 0.9951 Eukaryote AA
CUPP 0.9895 0.0193 0.9831 0.9960 Eukaryote AA
dbCAN_2 0.9922 0.0119 0.9896 0.9948 All CBM
dbCAN_2:HMMER 0.9961 0.0088 0.9941 0.9980 All CBM
dbCAN_2:DIAMOND 0.9914 0.0149 0.9881 0.9947 All CBM
dbCAN_2:Hotpep 0.9014 0.0511 0.8901 0.9128 All CBM
dbCAN_3 0.9943 0.0109 0.9919 0.9968 All CBM
dbCAN_3:HMMER 0.9960 0.0088 0.9941 0.9980 All CBM
dbCAN_3:DIAMOND 0.9931 0.0144 0.9899 0.9963 All CBM
dbCAN_3:eCAMI 0.9470 0.0523 0.9354 0.9587 All CBM
dbCAN_4 0.9951 0.0103 0.9928 0.9974 All CBM
dbCAN_4:HMMER 0.9958 0.0097 0.9937 0.9980 All CBM
dbCAN_4:DIAMOND 0.9938 0.0146 0.9906 0.9970 All CBM
dbCAN_4:dbCAN-sub 0.9927 0.0129 0.9898 0.9955 All CBM
CUPP 1.0000 0.0000 1.0000 1.0000 All CBM
dbCAN_2 0.9947 0.0091 0.9918 0.9976 Bacteria CBM
dbCAN_2:HMMER 0.9951 0.0106 0.9917 0.9985 Bacteria CBM
dbCAN_2:DIAMOND 0.9942 0.0146 0.9895 0.9988 Bacteria CBM
dbCAN_2:Hotpep 0.8917 0.0580 0.8732 0.9103 Bacteria CBM
dbCAN_3 0.9947 0.0110 0.9912 0.9982 Bacteria CBM
dbCAN_3:HMMER 0.9951 0.0106 0.9917 0.9985 Bacteria CBM
dbCAN_3:DIAMOND 0.9933 0.0164 0.9880 0.9985 Bacteria CBM
dbCAN_3:eCAMI 0.9349 0.0588 0.9161 0.9537 Bacteria CBM
dbCAN_4 0.9953 0.0106 0.9920 0.9987 Bacteria CBM
dbCAN_4:HMMER 0.9951 0.0106 0.9917 0.9985 Bacteria CBM
dbCAN_4:DIAMOND 0.9937 0.0154 0.9887 0.9986 Bacteria CBM
dbCAN_4:dbCAN-sub 0.9939 0.0113 0.9903 0.9975 Bacteria CBM
CUPP 1.0000 0.0000 1.0000 1.0000 Bacteria CBM
dbCAN_2 0.9897 0.0138 0.9853 0.9941 Eukaryote CBM
dbCAN_2:HMMER 0.9970 0.0066 0.9949 0.9991 Eukaryote CBM
dbCAN_2:DIAMOND 0.9886 0.0149 0.9839 0.9934 Eukaryote CBM
dbCAN_2:Hotpep 0.9112 0.0417 0.8978 0.9245 Eukaryote CBM
dbCAN_3 0.9940 0.0109 0.9905 0.9975 Eukaryote CBM
dbCAN_3:HMMER 0.9970 0.0066 0.9949 0.9991 Eukaryote CBM
dbCAN_3:DIAMOND 0.9929 0.0123 0.9889 0.9968 Eukaryote CBM
dbCAN_3:eCAMI 0.9591 0.0424 0.9456 0.9727 Eukaryote CBM
dbCAN_4 0.9949 0.0102 0.9916 0.9982 Eukaryote CBM
dbCAN_4:HMMER 0.9966 0.0087 0.9938 0.9993 Eukaryote CBM
dbCAN_4:DIAMOND 0.9939 0.0139 0.9895 0.9984 Eukaryote CBM
dbCAN_4:dbCAN-sub 0.9914 0.0144 0.9868 0.9961 Eukaryote CBM
CUPP 1.0000 0.0000 1.0000 1.0000 Eukaryote CBM

8.2.2 Sensitivity
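
Sensitivity (recall) for a single CAZy class is the fraction of proteins that truly belong to the class and that the tool also assigns to it, TP / (TP + FN). The sketch below is a minimal illustration, not the evaluation's own code: it assumes each protein's annotation is held as a set of class labels, and the names `class_sensitivity`, `truth` and `predicted` are hypothetical.

```python
from typing import Dict, Optional, Set


def class_sensitivity(
    truth: Dict[str, Set[str]],
    predicted: Dict[str, Set[str]],
    cazy_class: str,
) -> Optional[float]:
    """Return TP / (TP + FN) for one CAZy class, or None if the class is absent."""
    tp = fn = 0
    for protein, true_classes in truth.items():
        if cazy_class not in true_classes:
            continue  # true negatives and false positives do not affect sensitivity
        if cazy_class in predicted.get(protein, set()):
            tp += 1
        else:
            fn += 1
    if tp + fn == 0:
        return None  # class not present in this test set
    return tp / (tp + fn)


# Hypothetical test set: four proteins, three of which carry a GH domain.
truth = {"p1": {"GH"}, "p2": {"GH", "CBM"}, "p3": {"GH"}, "p4": {"GT"}}
predicted = {"p1": {"GH"}, "p2": {"GH"}, "p3": set(), "p4": {"GT"}}

gh_sensitivity = class_sensitivity(truth, predicted, "GH")
cbm_sensitivity = class_sensitivity(truth, predicted, "CBM")
print(gh_sensitivity, cbm_sensitivity)
```

Returning `None` when a class does not occur in a test set keeps such test sets out of any later average; whether the evaluation treats absent classes this way is an assumption.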

Table 8.7: Sensitivity of CAZy class classification for each CAZyme classifier, per taxonomy group
Prediction_tool Mean Standard_Deviation LowerCI UpperCI Tax_group CAZy_class
dbCAN_2 0.9354 0.0907 0.9152 0.9556 All GH
dbCAN_2:HMMER 0.9080 0.0824 0.8896 0.9263 All GH
dbCAN_2:DIAMOND 0.9379 0.1064 0.9142 0.9615 All GH
dbCAN_2:Hotpep 0.8641 0.1181 0.8378 0.8904 All GH
dbCAN_3 0.9567 0.0771 0.9396 0.9739 All GH
dbCAN_3:HMMER 0.9198 0.0830 0.9013 0.9383 All GH
dbCAN_3:DIAMOND 0.9760 0.0673 0.9610 0.9910 All GH
dbCAN_3:eCAMI 0.8764 0.1097 0.8520 0.9008 All GH
dbCAN_4 0.9500 0.0802 0.9322 0.9679 All GH
dbCAN_4:HMMER 0.9202 0.0828 0.9017 0.9386 All GH
dbCAN_4:DIAMOND 0.9737 0.0763 0.9567 0.9906 All GH
dbCAN_4:dbCAN-sub 0.9473 0.0789 0.9297 0.9648 All GH
CUPP 0.9080 0.0675 0.8930 0.9230 All GH
dbCAN_2 0.9364 0.1023 0.9037 0.9691 Bacteria GH
dbCAN_2:HMMER 0.9117 0.1098 0.8766 0.9469 Bacteria GH
dbCAN_2:DIAMOND 0.9342 0.1159 0.8972 0.9713 Bacteria GH
dbCAN_2:Hotpep 0.8977 0.1006 0.8655 0.9298 Bacteria GH
dbCAN_3 0.9597 0.1006 0.9275 0.9919 Bacteria GH
dbCAN_3:HMMER 0.9166 0.1118 0.8808 0.9524 Bacteria GH
dbCAN_3:DIAMOND 0.9649 0.0900 0.9361 0.9936 Bacteria GH
dbCAN_3:eCAMI 0.9142 0.1063 0.8802 0.9482 Bacteria GH
dbCAN_4 0.9534 0.1089 0.9186 0.9882 Bacteria GH
dbCAN_4:HMMER 0.9173 0.1116 0.8816 0.9530 Bacteria GH
dbCAN_4:DIAMOND 0.9588 0.1047 0.9253 0.9923 Bacteria GH
dbCAN_4:dbCAN-sub 0.9549 0.1056 0.9211 0.9887 Bacteria GH
CUPP 0.9170 0.0792 0.8917 0.9423 Bacteria GH
dbCAN_2 0.9343 0.0788 0.9091 0.9595 Eukaryote GH
dbCAN_2:HMMER 0.9042 0.0411 0.8911 0.9173 Eukaryote GH
dbCAN_2:DIAMOND 0.9415 0.0973 0.9104 0.9726 Eukaryote GH
dbCAN_2:Hotpep 0.8305 0.1257 0.7903 0.8707 Eukaryote GH
dbCAN_3 0.9538 0.0435 0.9398 0.9677 Eukaryote GH
dbCAN_3:HMMER 0.9230 0.0375 0.9110 0.9350 Eukaryote GH
dbCAN_3:DIAMOND 0.9872 0.0290 0.9779 0.9964 Eukaryote GH
dbCAN_3:eCAMI 0.8385 0.1007 0.8063 0.8708 Eukaryote GH
dbCAN_4 0.9467 0.0341 0.9358 0.9576 Eukaryote GH
dbCAN_4:HMMER 0.9230 0.0375 0.9110 0.9350 Eukaryote GH
dbCAN_4:DIAMOND 0.9885 0.0193 0.9823 0.9947 Eukaryote GH
dbCAN_4:dbCAN-sub 0.9396 0.0363 0.9280 0.9512 Eukaryote GH
CUPP 0.8990 0.0527 0.8821 0.9159 Eukaryote GH
dbCAN_2 0.8845 0.1378 0.8538 0.9152 All GT
dbCAN_2:HMMER 0.8627 0.1126 0.8376 0.8877 All GT
dbCAN_2:DIAMOND 0.9255 0.1508 0.8919 0.9591 All GT
dbCAN_2:Hotpep 0.7254 0.1807 0.6852 0.7656 All GT
dbCAN_3 0.9421 0.0971 0.9205 0.9637 All GT
dbCAN_3:HMMER 0.8654 0.1113 0.8406 0.8901 All GT
dbCAN_3:DIAMOND 0.9774 0.0897 0.9574 0.9973 All GT
dbCAN_3:eCAMI 0.8500 0.1524 0.8161 0.8839 All GT
dbCAN_4 0.9578 0.0921 0.9373 0.9783 All GT
dbCAN_4:HMMER 0.8657 0.1110 0.8410 0.8904 All GT
dbCAN_4:DIAMOND 0.9751 0.0914 0.9548 0.9954 All GT
dbCAN_4:dbCAN-sub 0.9538 0.0835 0.9352 0.9724 All GT
CUPP 0.8536 0.1107 0.8289 0.8782 All GT
dbCAN_2 0.8819 0.1395 0.8373 0.9265 Bacteria GT
dbCAN_2:HMMER 0.8491 0.1312 0.8071 0.8910 Bacteria GT
dbCAN_2:DIAMOND 0.9240 0.1583 0.8733 0.9746 Bacteria GT
dbCAN_2:Hotpep 0.6785 0.1870 0.6187 0.7383 Bacteria GT
dbCAN_3 0.9324 0.1249 0.8924 0.9723 Bacteria GT
dbCAN_3:HMMER 0.8482 0.1301 0.8066 0.8898 Bacteria GT
dbCAN_3:DIAMOND 0.9658 0.1219 0.9268 1.0048 Bacteria GT
dbCAN_3:eCAMI 0.8549 0.1621 0.8030 0.9067 Bacteria GT
dbCAN_4 0.9448 0.1259 0.9046 0.9851 Bacteria GT
dbCAN_4:HMMER 0.8493 0.1301 0.8077 0.8909 Bacteria GT
dbCAN_4:DIAMOND 0.9546 0.1262 0.9142 0.9949 Bacteria GT
dbCAN_4:dbCAN-sub 0.9445 0.1117 0.9088 0.9802 Bacteria GT
CUPP 0.8754 0.1137 0.8391 0.9118 Bacteria GT
dbCAN_2 0.8871 0.1379 0.8430 0.9312 Eukaryote GT
dbCAN_2:HMMER 0.8762 0.0901 0.8474 0.9051 Eukaryote GT
dbCAN_2:DIAMOND 0.9271 0.1449 0.8807 0.9734 Eukaryote GT
dbCAN_2:Hotpep 0.7723 0.1633 0.7201 0.8246 Eukaryote GT
dbCAN_3 0.9519 0.0576 0.9334 0.9703 Eukaryote GT
dbCAN_3:HMMER 0.8825 0.0870 0.8547 0.9103 Eukaryote GT
dbCAN_3:DIAMOND 0.9890 0.0339 0.9782 0.9998 Eukaryote GT
dbCAN_3:eCAMI 0.8451 0.1440 0.7991 0.8912 Eukaryote GT
dbCAN_4 0.9708 0.0317 0.9607 0.9809 Eukaryote GT
dbCAN_4:HMMER 0.8821 0.0866 0.8544 0.9098 Eukaryote GT
dbCAN_4:DIAMOND 0.9956 0.0118 0.9919 0.9994 Eukaryote GT
dbCAN_4:dbCAN-sub 0.9631 0.0384 0.9508 0.9754 Eukaryote GT
CUPP 0.8317 0.1045 0.7983 0.8651 Eukaryote GT
dbCAN_2 0.8797 0.2421 0.8086 0.9508 All PL
dbCAN_2:HMMER 0.8975 0.2125 0.8351 0.9598 All PL
dbCAN_2:DIAMOND 0.8691 0.2687 0.7902 0.9480 All PL
dbCAN_2:Hotpep 0.8407 0.2581 0.7650 0.9165 All PL
dbCAN_3 0.9881 0.0732 0.9666 1.0096 All PL
dbCAN_3:HMMER 0.9739 0.0982 0.9451 1.0028 All PL
dbCAN_3:DIAMOND 0.9881 0.0732 0.9666 1.0096 All PL
dbCAN_3:eCAMI 0.7960 0.2776 0.7154 0.8766 All PL
dbCAN_4 0.9739 0.0982 0.9451 1.0028 All PL
dbCAN_4:HMMER 0.9739 0.0982 0.9451 1.0028 All PL
dbCAN_4:DIAMOND 0.9987 0.0086 0.9962 1.0013 All PL
dbCAN_4:dbCAN-sub 0.9752 0.0982 0.9464 1.0040 All PL
CUPP 0.8511 0.2593 0.7749 0.9272 All PL
dbCAN_2 0.9099 0.2101 0.8314 0.9884 Bacteria PL
dbCAN_2:HMMER 0.9210 0.1641 0.8597 0.9823 Bacteria PL
dbCAN_2:DIAMOND 0.9266 0.1958 0.8534 0.9997 Bacteria PL
dbCAN_2:Hotpep 0.8738 0.1959 0.8007 0.9470 Bacteria PL
dbCAN_3 0.9814 0.0915 0.9472 1.0156 Bacteria PL
dbCAN_3:HMMER 0.9592 0.1211 0.9139 1.0044 Bacteria PL
dbCAN_3:DIAMOND 0.9980 0.0107 0.9940 1.0020 Bacteria PL
dbCAN_3:eCAMI 0.8181 0.2466 0.7276 0.9085 Bacteria PL
dbCAN_4 0.9592 0.1211 0.9139 1.0044 Bacteria PL
dbCAN_4:HMMER 0.9592 0.1211 0.9139 1.0044 Bacteria PL
dbCAN_4:DIAMOND 0.9980 0.0107 0.9940 1.0020 Bacteria PL
dbCAN_4:dbCAN-sub 0.9611 0.1213 0.9158 1.0064 Bacteria PL
CUPP 0.8484 0.2520 0.7542 0.9425 Bacteria PL
dbCAN_2 0.8265 0.2895 0.6776 0.9753 Eukaryote PL
dbCAN_2:HMMER 0.8559 0.2794 0.7122 0.9996 Eukaryote PL
dbCAN_2:DIAMOND 0.7676 0.3477 0.5889 0.9464 Eukaryote PL
dbCAN_2:Hotpep 0.7824 0.3409 0.6071 0.9576 Eukaryote PL
dbCAN_3 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_3:HMMER 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_3:DIAMOND 0.9706 0.1213 0.9082 1.0329 Eukaryote PL
dbCAN_3:eCAMI 0.7559 0.3311 0.5856 0.9261 Eukaryote PL
dbCAN_4 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_4:HMMER 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_4:DIAMOND 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_4:dbCAN-sub 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
CUPP 0.8559 0.2794 0.7122 0.9996 Eukaryote PL
dbCAN_2 0.9213 0.1433 0.8886 0.9540 All CE
dbCAN_2:HMMER 0.9208 0.1363 0.8896 0.9519 All CE
dbCAN_2:DIAMOND 0.8480 0.2469 0.7915 0.9044 All CE
dbCAN_2:Hotpep 0.8508 0.2065 0.8036 0.8980 All CE
dbCAN_3 0.9283 0.1537 0.8932 0.9634 All CE
dbCAN_3:HMMER 0.9230 0.1335 0.8925 0.9535 All CE
dbCAN_3:DIAMOND 0.9303 0.1704 0.8913 0.9692 All CE
dbCAN_3:eCAMI 0.8073 0.2445 0.7514 0.8632 All CE
dbCAN_4 0.9783 0.0669 0.9630 0.9935 All CE
dbCAN_4:HMMER 0.9529 0.1012 0.9298 0.9760 All CE
dbCAN_4:DIAMOND 0.9755 0.0851 0.9560 0.9949 All CE
dbCAN_4:dbCAN-sub 0.9755 0.0680 0.9599 0.9910 All CE
CUPP 0.9114 0.1332 0.8810 0.9419 All CE
dbCAN_2 0.9296 0.1184 0.8917 0.9675 Bacteria CE
dbCAN_2:HMMER 0.8801 0.1503 0.8320 0.9282 Bacteria CE
dbCAN_2:DIAMOND 0.8797 0.1688 0.8257 0.9337 Bacteria CE
dbCAN_2:Hotpep 0.9226 0.1251 0.8826 0.9627 Bacteria CE
dbCAN_3 0.9210 0.1569 0.8708 0.9712 Bacteria CE
dbCAN_3:HMMER 0.8801 0.1503 0.8320 0.9282 Bacteria CE
dbCAN_3:DIAMOND 0.9342 0.1346 0.8911 0.9772 Bacteria CE
dbCAN_3:eCAMI 0.8639 0.1908 0.8029 0.9250 Bacteria CE
dbCAN_4 0.9670 0.0836 0.9403 0.9937 Bacteria CE
dbCAN_4:HMMER 0.9245 0.1269 0.8839 0.9650 Bacteria CE
dbCAN_4:DIAMOND 0.9618 0.1104 0.9265 0.9971 Bacteria CE
dbCAN_4:dbCAN-sub 0.9634 0.0852 0.9362 0.9907 Bacteria CE
CUPP 0.8983 0.1322 0.8560 0.9405 Bacteria CE
dbCAN_2 0.9120 0.1679 0.8552 0.9689 Eukaryote CE
dbCAN_2:HMMER 0.9660 0.1032 0.9311 1.0009 Eukaryote CE
dbCAN_2:DIAMOND 0.8127 0.3106 0.7076 0.9178 Eukaryote CE
dbCAN_2:Hotpep 0.7710 0.2479 0.6871 0.8549 Eukaryote CE
dbCAN_3 0.9363 0.1518 0.8850 0.9877 Eukaryote CE
dbCAN_3:HMMER 0.9706 0.0926 0.9393 1.0019 Eukaryote CE
dbCAN_3:DIAMOND 0.9259 0.2049 0.8566 0.9952 Eukaryote CE
dbCAN_3:eCAMI 0.7444 0.2825 0.6488 0.8400 Eukaryote CE
dbCAN_4 0.9907 0.0387 0.9776 1.0038 Eukaryote CE
dbCAN_4:HMMER 0.9845 0.0455 0.9691 0.9999 Eukaryote CE
dbCAN_4:DIAMOND 0.9907 0.0387 0.9776 1.0038 Eukaryote CE
dbCAN_4:dbCAN-sub 0.9888 0.0383 0.9759 1.0018 Eukaryote CE
CUPP 0.9260 0.1346 0.8805 0.9716 Eukaryote CE
dbCAN_2 0.9190 0.1152 0.8855 0.9524 All AA
dbCAN_2:HMMER 0.9375 0.0917 0.9109 0.9641 All AA
dbCAN_2:DIAMOND 0.8693 0.1786 0.8174 0.9212 All AA
dbCAN_2:Hotpep 0.8737 0.1942 0.8173 0.9301 All AA
dbCAN_3 0.9835 0.0450 0.9705 0.9966 All AA
dbCAN_3:HMMER 0.9880 0.0393 0.9766 0.9994 All AA
dbCAN_3:DIAMOND 0.9884 0.0359 0.9779 0.9988 All AA
dbCAN_3:eCAMI 0.7821 0.2611 0.7063 0.8579 All AA
dbCAN_4 0.9937 0.0258 0.9862 1.0012 All AA
dbCAN_4:HMMER 0.9880 0.0393 0.9766 0.9994 All AA
dbCAN_4:DIAMOND 0.9700 0.0952 0.9423 0.9976 All AA
dbCAN_4:dbCAN-sub 0.9892 0.0331 0.9796 0.9988 All AA
CUPP 0.9047 0.1300 0.8670 0.9425 All AA
dbCAN_2 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:Hotpep 0.9091 0.3015 0.7065 1.1116 Bacteria AA
dbCAN_3 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:eCAMI 0.9091 0.3015 0.7065 1.1116 Bacteria AA
dbCAN_4 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:DIAMOND 0.9545 0.1508 0.8533 1.0558 Bacteria AA
dbCAN_4:dbCAN-sub 1.0000 0.0000 1.0000 1.0000 Bacteria AA
CUPP 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2 0.8949 0.1214 0.8544 0.9354 Eukaryote AA
dbCAN_2:HMMER 0.9189 0.0971 0.8865 0.9513 Eukaryote AA
dbCAN_2:DIAMOND 0.8305 0.1868 0.7682 0.8927 Eukaryote AA
dbCAN_2:Hotpep 0.8632 0.1533 0.8121 0.9143 Eukaryote AA
dbCAN_3 0.9786 0.0503 0.9619 0.9954 Eukaryote AA
dbCAN_3:HMMER 0.9844 0.0442 0.9697 0.9992 Eukaryote AA
dbCAN_3:DIAMOND 0.9849 0.0403 0.9715 0.9983 Eukaryote AA
dbCAN_3:eCAMI 0.7443 0.2395 0.6645 0.8242 Eukaryote AA
dbCAN_4 0.9918 0.0292 0.9821 1.0016 Eukaryote AA
dbCAN_4:HMMER 0.9844 0.0442 0.9697 0.9992 Eukaryote AA
dbCAN_4:DIAMOND 0.9746 0.0737 0.9500 0.9992 Eukaryote AA
dbCAN_4:dbCAN-sub 0.9860 0.0372 0.9736 0.9984 Eukaryote AA
CUPP 0.8764 0.1359 0.8311 0.9217 Eukaryote AA
dbCAN_2 0.7242 0.1925 0.6813 0.7670 All CBM
dbCAN_2:HMMER 0.4129 0.2165 0.3648 0.4611 All CBM
dbCAN_2:DIAMOND 0.7854 0.2191 0.7367 0.8342 All CBM
dbCAN_2:Hotpep 0.7061 0.2102 0.6594 0.7529 All CBM
dbCAN_3 0.7680 0.1926 0.7252 0.8109 All CBM
dbCAN_3:HMMER 0.4135 0.2148 0.3657 0.4613 All CBM
dbCAN_3:DIAMOND 0.8675 0.1731 0.8290 0.9060 All CBM
dbCAN_3:eCAMI 0.7346 0.2203 0.6856 0.7836 All CBM
dbCAN_4 0.7995 0.1921 0.7567 0.8422 All CBM
dbCAN_4:HMMER 0.4740 0.2360 0.4215 0.5266 All CBM
dbCAN_4:DIAMOND 0.8773 0.1695 0.8396 0.9151 All CBM
dbCAN_4:dbCAN-sub 0.8193 0.1995 0.7749 0.8637 All CBM
CUPP 0.0000 0.0000 0.0000 0.0000 All CBM
dbCAN_2 0.7317 0.2003 0.6676 0.7957 Bacteria CBM
dbCAN_2:HMMER 0.5107 0.2276 0.4379 0.5834 Bacteria CBM
dbCAN_2:DIAMOND 0.7590 0.2212 0.6883 0.8297 Bacteria CBM
dbCAN_2:Hotpep 0.7044 0.2472 0.6253 0.7835 Bacteria CBM
dbCAN_3 0.7630 0.2112 0.6954 0.8305 Bacteria CBM
dbCAN_3:HMMER 0.5092 0.2267 0.4367 0.5817 Bacteria CBM
dbCAN_3:DIAMOND 0.8023 0.1976 0.7391 0.8654 Bacteria CBM
dbCAN_3:eCAMI 0.7427 0.2415 0.6655 0.8199 Bacteria CBM
dbCAN_4 0.9025 0.1017 0.8699 0.9350 Bacteria CBM
dbCAN_4:HMMER 0.6086 0.2286 0.5355 0.6817 Bacteria CBM
dbCAN_4:DIAMOND 0.7955 0.1993 0.7318 0.8592 Bacteria CBM
dbCAN_4:dbCAN-sub 0.9441 0.0985 0.9126 0.9756 Bacteria CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Bacteria CBM
dbCAN_2 0.7167 0.1866 0.6570 0.7763 Eukaryote CBM
dbCAN_2:HMMER 0.3152 0.1537 0.2661 0.3644 Eukaryote CBM
dbCAN_2:DIAMOND 0.8119 0.2166 0.7426 0.8811 Eukaryote CBM
dbCAN_2:Hotpep 0.7079 0.1684 0.6540 0.7617 Eukaryote CBM
dbCAN_3 0.7730 0.1746 0.7172 0.8289 Eukaryote CBM
dbCAN_3:HMMER 0.3177 0.1525 0.2690 0.3665 Eukaryote CBM
dbCAN_3:DIAMOND 0.9327 0.1137 0.8964 0.9691 Eukaryote CBM
dbCAN_3:eCAMI 0.7264 0.1996 0.6626 0.7903 Eukaryote CBM
dbCAN_4 0.6965 0.2066 0.6305 0.7626 Eukaryote CBM
dbCAN_4:HMMER 0.3395 0.1530 0.2906 0.3884 Eukaryote CBM
dbCAN_4:DIAMOND 0.9592 0.0692 0.9370 0.9813 Eukaryote CBM
dbCAN_4:dbCAN-sub 0.6945 0.1974 0.6314 0.7577 Eukaryote CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Eukaryote CBM
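
Several intervals in Table 8.7 extend past 1, for example 0.7065 to 1.1116 for dbCAN_2:Hotpep on Bacteria AA. A symmetric interval around the mean produces bounds like these when most scores sit at the ceiling. The sketch below assumes a two-sided 95% t-interval, mean ± t · SD / sqrt(n), which is not stated in this section. The eleven scores and the critical value 2.228 (Student's t, 10 degrees of freedom) are illustrative choices, and `mean_ci` and `clip_unit` are hypothetical helper names.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_ci(scores: Sequence[float], t_crit: float) -> Tuple[float, float, float, float]:
    """Return (mean, sample SD, lower, upper) for a symmetric interval on the mean."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD (n - 1 denominator)
    half_width = t_crit * sd / math.sqrt(len(scores))
    return mean, sd, mean - half_width, mean + half_width


def clip_unit(value: float) -> float:
    """Clamp a bound to [0, 1], the valid range of a proportion."""
    return min(1.0, max(0.0, value))


# Assumed data: ten test sets with perfect sensitivity and one with none.
scores = [1.0] * 10 + [0.0]
T_CRIT_DF10 = 2.228  # two-sided 95% critical value of Student's t, 10 d.f.

mean, sd, lower, upper = mean_ci(scores, T_CRIT_DF10)
print(f"{mean:.4f} {sd:.4f} {lower:.4f} {upper:.4f}")
print(f"clipped upper bound: {clip_unit(upper):.4f}")
```

Under these assumptions the unclipped values round to 0.9091, 0.3015, 0.7065 and 1.1116, matching that row. Because sensitivity cannot exceed 1, a bound above 1 is best read as "at most 1"; clipping is one simple way to report it.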

8.2.3 Precision
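
Precision for a class is TP / (TP + FP), the fraction of a tool's predictions for that class that are correct. When a tool makes no predictions for a class, the denominator is zero and the ratio is undefined; one common convention reports 0 in that case. The sketch below shows that convention with hypothetical counts; the function name and the `zero_division` parameter are illustrative. Whether the zeros reported for CUPP on CBM in Table 8.8 arise this way is not stated, although a sensitivity of 0 in Table 8.7 implies TP = 0, so precision there is either 0 or undefined.

```python
def precision(tp: int, fp: int, zero_division: float = 0.0) -> float:
    """Return TP / (TP + FP); fall back to `zero_division` when nothing was predicted."""
    if tp < 0 or fp < 0:
        raise ValueError("counts must be non-negative")
    predicted_positive = tp + fp
    if predicted_positive == 0:
        return zero_division
    return tp / predicted_positive


# Hypothetical counts for one class in one test set.
print(precision(tp=45, fp=5))  # most predictions correct
print(precision(tp=0, fp=3))   # predictions made, none correct
print(precision(tp=0, fp=0))   # no predictions: the convention decides
```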

Table 8.8: Precision of CAZy class classification for each CAZyme classifier, per taxonomy group
Prediction_tool Mean Standard_Deviation LowerCI UpperCI Tax_group CAZy_class
dbCAN_2 0.9886 0.0347 0.9809 0.9963 All GH
dbCAN_2:HMMER 0.9897 0.0339 0.9821 0.9972 All GH
dbCAN_2:DIAMOND 0.9845 0.0360 0.9765 0.9926 All GH
dbCAN_2:Hotpep 0.9796 0.0419 0.9702 0.9889 All GH
dbCAN_3 0.9897 0.0332 0.9823 0.9971 All GH
dbCAN_3:HMMER 0.9909 0.0330 0.9836 0.9983 All GH
dbCAN_3:DIAMOND 0.9830 0.0391 0.9743 0.9917 All GH
dbCAN_3:eCAMI 0.9829 0.0395 0.9741 0.9917 All GH
dbCAN_4 0.9898 0.0337 0.9823 0.9973 All GH
dbCAN_4:HMMER 0.9909 0.0330 0.9836 0.9983 All GH
dbCAN_4:DIAMOND 0.9842 0.0393 0.9754 0.9929 All GH
dbCAN_4:dbCAN-sub 0.9896 0.0337 0.9821 0.9971 All GH
CUPP 0.9906 0.0338 0.9831 0.9981 All GH
dbCAN_2 0.9908 0.0219 0.9838 0.9978 Bacteria GH
dbCAN_2:HMMER 0.9935 0.0153 0.9886 0.9984 Bacteria GH
dbCAN_2:DIAMOND 0.9841 0.0265 0.9757 0.9926 Bacteria GH
dbCAN_2:Hotpep 0.9736 0.0369 0.9618 0.9854 Bacteria GH
dbCAN_3 0.9917 0.0191 0.9856 0.9978 Bacteria GH
dbCAN_3:HMMER 0.9945 0.0142 0.9900 0.9991 Bacteria GH
dbCAN_3:DIAMOND 0.9858 0.0247 0.9779 0.9937 Bacteria GH
dbCAN_3:eCAMI 0.9805 0.0274 0.9718 0.9893 Bacteria GH
dbCAN_4 0.9923 0.0176 0.9867 0.9980 Bacteria GH
dbCAN_4:HMMER 0.9945 0.0142 0.9900 0.9991 Bacteria GH
dbCAN_4:DIAMOND 0.9886 0.0244 0.9808 0.9964 Bacteria GH
dbCAN_4:dbCAN-sub 0.9920 0.0175 0.9864 0.9976 Bacteria GH
CUPP 0.9931 0.0143 0.9886 0.9977 Bacteria GH
dbCAN_2 0.9863 0.0442 0.9722 1.0005 Eukaryote GH
dbCAN_2:HMMER 0.9859 0.0454 0.9714 1.0004 Eukaryote GH
dbCAN_2:DIAMOND 0.9850 0.0439 0.9709 0.9990 Eukaryote GH
dbCAN_2:Hotpep 0.9855 0.0461 0.9708 1.0003 Eukaryote GH
dbCAN_3 0.9877 0.0432 0.9739 1.0015 Eukaryote GH
dbCAN_3:HMMER 0.9873 0.0444 0.9731 1.0015 Eukaryote GH
dbCAN_3:DIAMOND 0.9801 0.0497 0.9642 0.9960 Eukaryote GH
dbCAN_3:eCAMI 0.9852 0.0489 0.9695 1.0008 Eukaryote GH
dbCAN_4 0.9873 0.0445 0.9730 1.0015 Eukaryote GH
dbCAN_4:HMMER 0.9873 0.0444 0.9731 1.0015 Eukaryote GH
dbCAN_4:DIAMOND 0.9797 0.0499 0.9638 0.9957 Eukaryote GH
dbCAN_4:dbCAN-sub 0.9872 0.0445 0.9730 1.0015 Eukaryote GH
CUPP 0.9880 0.0458 0.9734 1.0027 Eukaryote GH
dbCAN_2 0.9898 0.0542 0.9778 1.0019 All GT
dbCAN_2:HMMER 0.9884 0.0566 0.9758 1.0010 All GT
dbCAN_2:DIAMOND 0.9886 0.0567 0.9760 1.0013 All GT
dbCAN_2:Hotpep 0.9836 0.0688 0.9683 0.9989 All GT
dbCAN_3 0.9891 0.0563 0.9766 1.0016 All GT
dbCAN_3:HMMER 0.9884 0.0566 0.9758 1.0010 All GT
dbCAN_3:DIAMOND 0.9839 0.0620 0.9701 0.9977 All GT
dbCAN_3:eCAMI 0.9881 0.0572 0.9754 1.0009 All GT
dbCAN_4 0.9866 0.0566 0.9740 0.9992 All GT
dbCAN_4:HMMER 0.9884 0.0566 0.9758 1.0010 All GT
dbCAN_4:DIAMOND 0.9841 0.0638 0.9699 0.9983 All GT
dbCAN_4:dbCAN-sub 0.9866 0.0566 0.9740 0.9992 All GT
CUPP 0.9883 0.0625 0.9744 1.0022 All GT
dbCAN_2 0.9987 0.0058 0.9969 1.0006 Bacteria GT
dbCAN_2:HMMER 0.9988 0.0055 0.9970 1.0005 Bacteria GT
dbCAN_2:DIAMOND 0.9984 0.0069 0.9962 1.0007 Bacteria GT
dbCAN_2:Hotpep 0.9908 0.0250 0.9828 0.9988 Bacteria GT
dbCAN_3 0.9988 0.0053 0.9971 1.0005 Bacteria GT
dbCAN_3:HMMER 0.9988 0.0054 0.9971 1.0005 Bacteria GT
dbCAN_3:DIAMOND 0.9975 0.0088 0.9947 1.0004 Bacteria GT
dbCAN_3:eCAMI 0.9963 0.0112 0.9928 0.9999 Bacteria GT
dbCAN_4 0.9980 0.0074 0.9957 1.0004 Bacteria GT
dbCAN_4:HMMER 0.9988 0.0054 0.9971 1.0005 Bacteria GT
dbCAN_4:DIAMOND 0.9975 0.0091 0.9945 1.0004 Bacteria GT
dbCAN_4:dbCAN-sub 0.9980 0.0074 0.9957 1.0004 Bacteria GT
CUPP 0.9995 0.0030 0.9985 1.0005 Bacteria GT
dbCAN_2 0.9810 0.0759 0.9567 1.0052 Eukaryote GT
dbCAN_2:HMMER 0.9781 0.0790 0.9528 1.0033 Eukaryote GT
dbCAN_2:DIAMOND 0.9788 0.0792 0.9535 1.0042 Eukaryote GT
dbCAN_2:Hotpep 0.9764 0.0942 0.9463 1.0066 Eukaryote GT
dbCAN_3 0.9794 0.0787 0.9542 1.0045 Eukaryote GT
dbCAN_3:HMMER 0.9781 0.0790 0.9528 1.0033 Eukaryote GT
dbCAN_3:DIAMOND 0.9702 0.0856 0.9429 0.9976 Eukaryote GT
dbCAN_3:eCAMI 0.9799 0.0798 0.9544 1.0054 Eukaryote GT
dbCAN_4 0.9752 0.0785 0.9501 1.0003 Eukaryote GT
dbCAN_4:HMMER 0.9781 0.0790 0.9528 1.0033 Eukaryote GT
dbCAN_4:DIAMOND 0.9707 0.0883 0.9425 0.9990 Eukaryote GT
dbCAN_4:dbCAN-sub 0.9752 0.0785 0.9501 1.0003 Eukaryote GT
CUPP 0.9771 0.0874 0.9491 1.0050 Eukaryote GT
dbCAN_2 0.9532 0.2052 0.8929 1.0134 All PL
dbCAN_2:HMMER 0.9745 0.1481 0.9310 1.0180 All PL
dbCAN_2:DIAMOND 0.9248 0.2505 0.8513 0.9984 All PL
dbCAN_2:Hotpep 0.9506 0.2050 0.8904 1.0108 All PL
dbCAN_3 0.9846 0.0766 0.9621 1.0071 All PL
dbCAN_3:HMMER 0.9846 0.0766 0.9621 1.0071 All PL
dbCAN_3:DIAMOND 0.9730 0.1012 0.9433 1.0028 All PL
dbCAN_3:eCAMI 0.9333 0.2452 0.8621 1.0045 All PL
dbCAN_4 0.9846 0.0766 0.9621 1.0071 All PL
dbCAN_4:HMMER 0.9846 0.0766 0.9621 1.0071 All PL
dbCAN_4:DIAMOND 0.9730 0.1012 0.9433 1.0028 All PL
dbCAN_4:dbCAN-sub 0.9846 0.0766 0.9621 1.0071 All PL
CUPP 0.9496 0.2058 0.8892 1.0101 All PL
dbCAN_2 0.9667 0.1826 0.8985 1.0348 Bacteria PL
dbCAN_2:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria PL
dbCAN_2:DIAMOND 0.9556 0.1904 0.8844 1.0267 Bacteria PL
dbCAN_2:Hotpep 0.9960 0.0154 0.9902 1.0017 Bacteria PL
dbCAN_3 0.9980 0.0107 0.9940 1.0020 Bacteria PL
dbCAN_3:HMMER 0.9980 0.0107 0.9940 1.0020 Bacteria PL
dbCAN_3:DIAMOND 0.9869 0.0614 0.9640 1.0099 Bacteria PL
dbCAN_3:eCAMI 0.9677 0.1796 0.9019 1.0336 Bacteria PL
dbCAN_4 0.9980 0.0107 0.9940 1.0020 Bacteria PL
dbCAN_4:HMMER 0.9980 0.0107 0.9940 1.0020 Bacteria PL
dbCAN_4:DIAMOND 0.9869 0.0614 0.9640 1.0099 Bacteria PL
dbCAN_4:dbCAN-sub 0.9980 0.0107 0.9940 1.0020 Bacteria PL
CUPP 0.9667 0.1826 0.8985 1.0348 Bacteria PL
dbCAN_2 0.9294 0.2443 0.8038 1.0550 Eukaryote PL
dbCAN_2:HMMER 0.9294 0.2443 0.8038 1.0550 Eukaryote PL
dbCAN_2:DIAMOND 0.8706 0.3312 0.7003 1.0409 Eukaryote PL
dbCAN_2:Hotpep 0.8706 0.3312 0.7003 1.0409 Eukaryote PL
dbCAN_3 0.9608 0.1254 0.8963 1.0253 Eukaryote PL
dbCAN_3:HMMER 0.9608 0.1254 0.8963 1.0253 Eukaryote PL
dbCAN_3:DIAMOND 0.9485 0.1470 0.8730 1.0241 Eukaryote PL
dbCAN_3:eCAMI 0.8706 0.3312 0.7003 1.0409 Eukaryote PL
dbCAN_4 0.9608 0.1254 0.8963 1.0253 Eukaryote PL
dbCAN_4:HMMER 0.9608 0.1254 0.8963 1.0253 Eukaryote PL
dbCAN_4:DIAMOND 0.9485 0.1470 0.8730 1.0241 Eukaryote PL
dbCAN_4:dbCAN-sub 0.9608 0.1254 0.8963 1.0253 Eukaryote PL
CUPP 0.9196 0.2447 0.7938 1.0454 Eukaryote PL
dbCAN_2 0.9519 0.1408 0.9197 0.9840 All CE
dbCAN_2:HMMER 0.9525 0.1191 0.9253 0.9797 All CE
dbCAN_2:DIAMOND 0.9280 0.2089 0.8803 0.9757 All CE
dbCAN_2:Hotpep 0.9128 0.1671 0.8746 0.9510 All CE
dbCAN_3 0.9526 0.1391 0.9208 0.9844 All CE
dbCAN_3:HMMER 0.9602 0.1103 0.9350 0.9854 All CE
dbCAN_3:DIAMOND 0.9304 0.1864 0.8878 0.9730 All CE
dbCAN_3:eCAMI 0.9156 0.1909 0.8719 0.9592 All CE
dbCAN_4 0.9603 0.1103 0.9351 0.9855 All CE
dbCAN_4:HMMER 0.9603 0.1103 0.9351 0.9855 All CE
dbCAN_4:DIAMOND 0.9476 0.1506 0.9132 0.9820 All CE
dbCAN_4:dbCAN-sub 0.9592 0.1120 0.9336 0.9848 All CE
CUPP 0.9598 0.1130 0.9339 0.9856 All CE
dbCAN_2 0.9148 0.1833 0.8562 0.9734 Bacteria CE
dbCAN_2:HMMER 0.9218 0.1527 0.8729 0.9706 Bacteria CE
dbCAN_2:DIAMOND 0.9182 0.1867 0.8585 0.9779 Bacteria CE
dbCAN_2:Hotpep 0.8560 0.2018 0.7914 0.9205 Bacteria CE
dbCAN_3 0.9149 0.1821 0.8567 0.9732 Bacteria CE
dbCAN_3:HMMER 0.9322 0.1429 0.8865 0.9779 Bacteria CE
dbCAN_3:DIAMOND 0.9219 0.1752 0.8658 0.9779 Bacteria CE
dbCAN_3:eCAMI 0.8681 0.2000 0.8042 0.9321 Bacteria CE
dbCAN_4 0.9324 0.1429 0.8867 0.9781 Bacteria CE
dbCAN_4:HMMER 0.9324 0.1429 0.8867 0.9781 Bacteria CE
dbCAN_4:DIAMOND 0.9295 0.1739 0.8739 0.9851 Bacteria CE
dbCAN_4:dbCAN-sub 0.9321 0.1429 0.8864 0.9778 Bacteria CE
CUPP 0.9298 0.1452 0.8834 0.9762 Bacteria CE
dbCAN_2 0.9931 0.0417 0.9790 1.0072 Eukaryote CE
dbCAN_2:HMMER 0.9867 0.0459 0.9712 1.0023 Eukaryote CE
dbCAN_2:DIAMOND 0.9389 0.2333 0.8599 1.0178 Eukaryote CE
dbCAN_2:Hotpep 0.9759 0.0818 0.9482 1.0035 Eukaryote CE
dbCAN_3 0.9944 0.0333 0.9832 1.0057 Eukaryote CE
dbCAN_3:HMMER 0.9914 0.0377 0.9786 1.0041 Eukaryote CE
dbCAN_3:DIAMOND 0.9399 0.2002 0.8722 1.0076 Eukaryote CE
dbCAN_3:eCAMI 0.9683 0.1677 0.9115 1.0250 Eukaryote CE
dbCAN_4 0.9914 0.0377 0.9786 1.0041 Eukaryote CE
dbCAN_4:HMMER 0.9914 0.0377 0.9786 1.0041 Eukaryote CE
dbCAN_4:DIAMOND 0.9677 0.1189 0.9275 1.0079 Eukaryote CE
dbCAN_4:dbCAN-sub 0.9893 0.0486 0.9729 1.0058 Eukaryote CE
CUPP 0.9931 0.0417 0.9790 1.0072 Eukaryote CE
dbCAN_2 0.9208 0.1693 0.8716 0.9699 All AA
dbCAN_2:HMMER 0.9191 0.1607 0.8724 0.9658 All AA
dbCAN_2:DIAMOND 0.9228 0.1710 0.8731 0.9724 All AA
dbCAN_2:Hotpep 0.8983 0.2166 0.8354 0.9612 All AA
dbCAN_3 0.9236 0.1605 0.8770 0.9702 All AA
dbCAN_3:HMMER 0.9192 0.1624 0.8720 0.9664 All AA
dbCAN_3:DIAMOND 0.9171 0.1693 0.8679 0.9662 All AA
dbCAN_3:eCAMI 0.8514 0.2809 0.7699 0.9330 All AA
dbCAN_4 0.9198 0.1610 0.8731 0.9666 All AA
dbCAN_4:HMMER 0.9192 0.1624 0.8720 0.9664 All AA
dbCAN_4:DIAMOND 0.9196 0.1697 0.8703 0.9689 All AA
dbCAN_4:dbCAN-sub 0.9177 0.1620 0.8707 0.9648 All AA
CUPP 0.9184 0.1697 0.8691 0.9677 All AA
dbCAN_2 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:Hotpep 0.9091 0.3015 0.7065 1.1116 Bacteria AA
dbCAN_3 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:eCAMI 0.9091 0.3015 0.7065 1.1116 Bacteria AA
dbCAN_4 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:dbCAN-sub 1.0000 0.0000 1.0000 1.0000 Bacteria AA
CUPP 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2 0.8972 0.1869 0.8349 0.9595 Eukaryote AA
dbCAN_2:HMMER 0.8950 0.1764 0.8362 0.9539 Eukaryote AA
dbCAN_2:DIAMOND 0.8998 0.1892 0.8367 0.9629 Eukaryote AA
dbCAN_2:Hotpep 0.8951 0.1896 0.8319 0.9583 Eukaryote AA
dbCAN_3 0.9009 0.1770 0.8419 0.9599 Eukaryote AA
dbCAN_3:HMMER 0.8952 0.1785 0.8357 0.9547 Eukaryote AA
dbCAN_3:DIAMOND 0.8925 0.1862 0.8304 0.9545 Eukaryote AA
dbCAN_3:eCAMI 0.8343 0.2765 0.7421 0.9264 Eukaryote AA
dbCAN_4 0.8960 0.1769 0.8370 0.9550 Eukaryote AA
dbCAN_4:HMMER 0.8952 0.1785 0.8357 0.9547 Eukaryote AA
dbCAN_4:DIAMOND 0.8957 0.1872 0.8333 0.9581 Eukaryote AA
dbCAN_4:dbCAN-sub 0.8932 0.1777 0.8340 0.9525 Eukaryote AA
CUPP 0.8942 0.1870 0.8318 0.9565 Eukaryote AA
dbCAN_2 0.9203 0.1333 0.8906 0.9500 All CBM
dbCAN_2:HMMER 0.8965 0.2304 0.8452 0.9478 All CBM
dbCAN_2:DIAMOND 0.9219 0.1502 0.8885 0.9553 All CBM
dbCAN_2:Hotpep 0.4696 0.1731 0.4311 0.5081 All CBM
dbCAN_3 0.9431 0.1355 0.9129 0.9732 All CBM
dbCAN_3:HMMER 0.8964 0.2304 0.8451 0.9476 All CBM
dbCAN_3:DIAMOND 0.9512 0.1283 0.9227 0.9797 All CBM
dbCAN_3:eCAMI 0.6615 0.2254 0.6113 0.7117 All CBM
dbCAN_4 0.9562 0.0919 0.9358 0.9767 All CBM
dbCAN_4:HMMER 0.9184 0.1999 0.8739 0.9629 All CBM
dbCAN_4:DIAMOND 0.9571 0.1273 0.9288 0.9854 All CBM
dbCAN_4:dbCAN-sub 0.9376 0.1151 0.9120 0.9632 All CBM
CUPP 0.0000 0.0000 0.0000 0.0000 All CBM
dbCAN_2 0.9517 0.1091 0.9168 0.9866 Bacteria CBM
dbCAN_2:HMMER 0.9380 0.1365 0.8944 0.9817 Bacteria CBM
dbCAN_2:DIAMOND 0.9551 0.1626 0.9031 1.0071 Bacteria CBM
dbCAN_2:Hotpep 0.4795 0.2000 0.4156 0.5435 Bacteria CBM
dbCAN_3 0.9384 0.1704 0.8839 0.9929 Bacteria CBM
dbCAN_3:HMMER 0.9377 0.1365 0.8941 0.9814 Bacteria CBM
dbCAN_3:DIAMOND 0.9600 0.1592 0.9091 1.0109 Bacteria CBM
dbCAN_3:eCAMI 0.6317 0.2308 0.5579 0.7055 Bacteria CBM
dbCAN_4 0.9699 0.0662 0.9487 0.9911 Bacteria CBM
dbCAN_4:HMMER 0.9524 0.1130 0.9163 0.9886 Bacteria CBM
dbCAN_4:DIAMOND 0.9604 0.1591 0.9095 1.0113 Bacteria CBM
dbCAN_4:dbCAN-sub 0.9624 0.0681 0.9406 0.9842 Bacteria CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Bacteria CBM
dbCAN_2 0.8889 0.1486 0.8414 0.9364 Eukaryote CBM
dbCAN_2:HMMER 0.8550 0.2922 0.7616 0.9484 Eukaryote CBM
dbCAN_2:DIAMOND 0.8887 0.1303 0.8471 0.9304 Eukaryote CBM
dbCAN_2:Hotpep 0.4597 0.1431 0.4139 0.5055 Eukaryote CBM
dbCAN_3 0.9477 0.0901 0.9189 0.9765 Eukaryote CBM
dbCAN_3:HMMER 0.8550 0.2922 0.7616 0.9484 Eukaryote CBM
dbCAN_3:DIAMOND 0.9424 0.0885 0.9141 0.9707 Eukaryote CBM
dbCAN_3:eCAMI 0.6913 0.2188 0.6214 0.7613 Eukaryote CBM
dbCAN_4 0.9426 0.1111 0.9071 0.9781 Eukaryote CBM
dbCAN_4:HMMER 0.8844 0.2565 0.8023 0.9664 Eukaryote CBM
dbCAN_4:DIAMOND 0.9538 0.0865 0.9261 0.9815 Eukaryote CBM
dbCAN_4:dbCAN-sub 0.9128 0.1447 0.8665 0.9591 Eukaryote CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Eukaryote CBM

8.2.4 F1-score
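
The F1-score is the harmonic mean of precision and sensitivity. If the score is computed per test set and then averaged, as the reported means and standard deviations suggest (an assumption), the mean F1 need not equal the F1 computed from the mean precision and mean sensitivity. The sketch below illustrates the difference with made-up values; `f1_score` and `per_test_set` are hypothetical names.

```python
import statistics
from typing import List, Tuple


def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity; 0.0 when both are zero."""
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# Made-up (precision, sensitivity) pairs for three test sets.
per_test_set: List[Tuple[float, float]] = [(1.0, 0.5), (0.9, 0.9), (0.8, 1.0)]

mean_of_f1 = statistics.mean(f1_score(p, s) for p, s in per_test_set)
f1_of_means = f1_score(
    statistics.mean(p for p, _ in per_test_set),
    statistics.mean(s for _, s in per_test_set),
)
print(f"mean of per-test-set F1: {mean_of_f1:.4f}")
print(f"F1 of the mean scores:   {f1_of_means:.4f}")
```

The same gap is visible in the reported data: for dbCAN_2 on All GH, the mean sensitivity (0.9354) and mean precision (0.9886) combine to a harmonic mean of about 0.961, whereas Table 8.9 reports a mean F1-score of 0.9583.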

Table 8.9: F1-score of CAZy class classification for each CAZyme classifier, per taxonomy group
Prediction_tool Mean Standard_Deviation LowerCI UpperCI Tax_group CAZy_class
dbCAN_2 0.9583 0.0614 0.9446 0.9719 All GH
dbCAN_2:HMMER 0.9444 0.0587 0.9313 0.9574 All GH
dbCAN_2:DIAMOND 0.9563 0.0727 0.9401 0.9725 All GH
dbCAN_2:Hotpep 0.9136 0.0773 0.8964 0.9308 All GH
dbCAN_3 0.9708 0.0558 0.9583 0.9832 All GH
dbCAN_3:HMMER 0.9514 0.0596 0.9382 0.9647 All GH
dbCAN_3:DIAMOND 0.9776 0.0491 0.9666 0.9885 All GH
dbCAN_3:eCAMI 0.9223 0.0695 0.9069 0.9378 All GH
dbCAN_4 0.9671 0.0590 0.9539 0.9802 All GH
dbCAN_4:HMMER 0.9516 0.0596 0.9384 0.9649 All GH
dbCAN_4:DIAMOND 0.9764 0.0583 0.9634 0.9894 All GH
dbCAN_4:dbCAN-sub 0.9655 0.0581 0.9526 0.9785 All GH
CUPP 0.9461 0.0454 0.9360 0.9562 All GH
dbCAN_2 0.9592 0.0692 0.9371 0.9814 Bacteria GH
dbCAN_2:HMMER 0.9464 0.0769 0.9218 0.9710 Bacteria GH
dbCAN_2:DIAMOND 0.9535 0.0809 0.9277 0.9794 Bacteria GH
dbCAN_2:Hotpep 0.9308 0.0630 0.9106 0.9509 Bacteria GH
dbCAN_3 0.9718 0.0713 0.9490 0.9947 Bacteria GH
dbCAN_3:HMMER 0.9495 0.0782 0.9245 0.9745 Bacteria GH
dbCAN_3:DIAMOND 0.9723 0.0614 0.9527 0.9919 Bacteria GH
dbCAN_3:eCAMI 0.9423 0.0641 0.9218 0.9628 Bacteria GH
dbCAN_4 0.9682 0.0776 0.9434 0.9930 Bacteria GH
dbCAN_4:HMMER 0.9499 0.0782 0.9249 0.9749 Bacteria GH
dbCAN_4:DIAMOND 0.9694 0.0764 0.9449 0.9938 Bacteria GH
dbCAN_4:dbCAN-sub 0.9690 0.0761 0.9447 0.9933 Bacteria GH
CUPP 0.9516 0.0487 0.9360 0.9672 Bacteria GH
dbCAN_2 0.9573 0.0533 0.9403 0.9744 Eukaryote GH
dbCAN_2:HMMER 0.9423 0.0327 0.9319 0.9528 Eukaryote GH
dbCAN_2:DIAMOND 0.9590 0.0644 0.9384 0.9796 Eukaryote GH
dbCAN_2:Hotpep 0.8965 0.0868 0.8687 0.9242 Eukaryote GH
dbCAN_3 0.9697 0.0350 0.9585 0.9809 Eukaryote GH
dbCAN_3:HMMER 0.9534 0.0329 0.9429 0.9639 Eukaryote GH
dbCAN_3:DIAMOND 0.9828 0.0327 0.9724 0.9933 Eukaryote GH
dbCAN_3:eCAMI 0.9023 0.0698 0.8800 0.9247 Eukaryote GH
dbCAN_4 0.9659 0.0319 0.9557 0.9761 Eukaryote GH
dbCAN_4:HMMER 0.9534 0.0329 0.9429 0.9639 Eukaryote GH
dbCAN_4:DIAMOND 0.9834 0.0307 0.9736 0.9932 Eukaryote GH
dbCAN_4:dbCAN-sub 0.9621 0.0320 0.9519 0.9723 Eukaryote GH
CUPP 0.9405 0.0416 0.9272 0.9538 Eukaryote GH
dbCAN_2 0.9258 0.1016 0.9032 0.9484 All GT
dbCAN_2:HMMER 0.9152 0.0845 0.8964 0.9340 All GT
dbCAN_2:DIAMOND 0.9463 0.1124 0.9213 0.9713 All GT
dbCAN_2:Hotpep 0.8209 0.1383 0.7901 0.8517 All GT
dbCAN_3 0.9606 0.0790 0.9430 0.9782 All GT
dbCAN_3:HMMER 0.9169 0.0839 0.8982 0.9355 All GT
dbCAN_3:DIAMOND 0.9764 0.0755 0.9596 0.9932 All GT
dbCAN_3:eCAMI 0.9046 0.1076 0.8806 0.9285 All GT
dbCAN_4 0.9677 0.0753 0.9510 0.9845 All GT
dbCAN_4:HMMER 0.9170 0.0838 0.8984 0.9357 All GT
dbCAN_4:DIAMOND 0.9750 0.0781 0.9577 0.9924 All GT
dbCAN_4:dbCAN-sub 0.9664 0.0656 0.9518 0.9810 All GT
CUPP 0.9107 0.0758 0.8938 0.9276 All GT
dbCAN_2 0.9294 0.1020 0.8968 0.9620 Bacteria GT
dbCAN_2:HMMER 0.9110 0.1012 0.8786 0.9434 Bacteria GT
dbCAN_2:DIAMOND 0.9506 0.1147 0.9139 0.9873 Bacteria GT
dbCAN_2:Hotpep 0.7903 0.1418 0.7449 0.8357 Bacteria GT
dbCAN_3 0.9584 0.0970 0.9274 0.9894 Bacteria GT
dbCAN_3:HMMER 0.9106 0.1007 0.8784 0.9428 Bacteria GT
dbCAN_3:DIAMOND 0.9758 0.0925 0.9462 1.0054 Bacteria GT
dbCAN_3:eCAMI 0.9112 0.1075 0.8768 0.9455 Bacteria GT
dbCAN_4 0.9647 0.0958 0.9340 0.9953 Bacteria GT
dbCAN_4:HMMER 0.9112 0.1007 0.8790 0.9434 Bacteria GT
dbCAN_4:DIAMOND 0.9695 0.0962 0.9387 1.0003 Bacteria GT
dbCAN_4:dbCAN-sub 0.9660 0.0801 0.9404 0.9917 Bacteria GT
CUPP 0.9292 0.0713 0.9064 0.9520 Bacteria GT
dbCAN_2 0.9222 0.1023 0.8895 0.9550 Eukaryote GT
dbCAN_2:HMMER 0.9194 0.0648 0.8987 0.9401 Eukaryote GT
dbCAN_2:DIAMOND 0.9420 0.1112 0.9064 0.9776 Eukaryote GT
dbCAN_2:Hotpep 0.8515 0.1293 0.8101 0.8928 Eukaryote GT
dbCAN_3 0.9628 0.0569 0.9446 0.9810 Eukaryote GT
dbCAN_3:HMMER 0.9231 0.0636 0.9028 0.9435 Eukaryote GT
dbCAN_3:DIAMOND 0.9769 0.0547 0.9595 0.9944 Eukaryote GT
dbCAN_3:eCAMI 0.8980 0.1085 0.8633 0.9327 Eukaryote GT
dbCAN_4 0.9708 0.0479 0.9555 0.9861 Eukaryote GT
dbCAN_4:HMMER 0.9229 0.0634 0.9026 0.9432 Eukaryote GT
dbCAN_4:DIAMOND 0.9806 0.0551 0.9630 0.9982 Eukaryote GT
dbCAN_4:dbCAN-sub 0.9667 0.0481 0.9513 0.9821 Eukaryote GT
CUPP 0.8923 0.0766 0.8678 0.9168 Eukaryote GT
dbCAN_2 0.9073 0.2183 0.8432 0.9714 All PL
dbCAN_2:HMMER 0.9250 0.1788 0.8725 0.9775 All PL
dbCAN_2:DIAMOND 0.8889 0.2530 0.8146 0.9632 All PL
dbCAN_2:Hotpep 0.8803 0.2257 0.8140 0.9465 All PL
dbCAN_3 0.9826 0.0689 0.9624 1.0028 All PL
dbCAN_3:HMMER 0.9741 0.0781 0.9511 0.9970 All PL
dbCAN_3:DIAMOND 0.9754 0.0792 0.9521 0.9986 All PL
dbCAN_3:eCAMI 0.8473 0.2533 0.7737 0.9208 All PL
dbCAN_4 0.9741 0.0781 0.9511 0.9970 All PL
dbCAN_4:HMMER 0.9741 0.0781 0.9511 0.9970 All PL
dbCAN_4:DIAMOND 0.9825 0.0646 0.9635 1.0014 All PL
dbCAN_4:dbCAN-sub 0.9747 0.0782 0.9518 0.9977 All PL
CUPP 0.8850 0.2288 0.8178 0.9522 All PL
dbCAN_2 0.9329 0.1918 0.8612 1.0045 Bacteria PL
dbCAN_2:HMMER 0.9495 0.1139 0.9070 0.9921 Bacteria PL
dbCAN_2:DIAMOND 0.9373 0.1869 0.8675 1.0071 Bacteria PL
dbCAN_2:Hotpep 0.9175 0.1327 0.8679 0.9670 Bacteria PL
dbCAN_3 0.9869 0.0610 0.9641 1.0096 Bacteria PL
dbCAN_3:HMMER 0.9735 0.0770 0.9448 1.0023 Bacteria PL
dbCAN_3:DIAMOND 0.9913 0.0369 0.9775 1.0051 Bacteria PL
dbCAN_3:eCAMI 0.8734 0.2083 0.7970 0.9498 Bacteria PL
dbCAN_4 0.9735 0.0770 0.9448 1.0023 Bacteria PL
dbCAN_4:HMMER 0.9735 0.0770 0.9448 1.0023 Bacteria PL
dbCAN_4:DIAMOND 0.9913 0.0369 0.9775 1.0051 Bacteria PL
dbCAN_4:dbCAN-sub 0.9745 0.0771 0.9457 1.0034 Bacteria PL
CUPP 0.8899 0.2178 0.8085 0.9712 Bacteria PL
dbCAN_2 0.8622 0.2587 0.7292 0.9952 Eukaryote PL
dbCAN_2:HMMER 0.8818 0.2556 0.7504 1.0132 Eukaryote PL
dbCAN_2:DIAMOND 0.8034 0.3294 0.6340 0.9727 Eukaryote PL
dbCAN_2:Hotpep 0.8146 0.3277 0.6461 0.9831 Eukaryote PL
dbCAN_3 0.9750 0.0825 0.9326 1.0174 Eukaryote PL
dbCAN_3:HMMER 0.9750 0.0825 0.9326 1.0174 Eukaryote PL
dbCAN_3:DIAMOND 0.9472 0.1194 0.8858 1.0086 Eukaryote PL
dbCAN_3:eCAMI 0.7996 0.3214 0.6344 0.9649 Eukaryote PL
dbCAN_4 0.9750 0.0825 0.9326 1.0174 Eukaryote PL
dbCAN_4:HMMER 0.9750 0.0825 0.9326 1.0174 Eukaryote PL
dbCAN_4:DIAMOND 0.9668 0.0954 0.9178 1.0159 Eukaryote PL
dbCAN_4:dbCAN-sub 0.9750 0.0825 0.9326 1.0174 Eukaryote PL
CUPP 0.8764 0.2539 0.7459 1.0070 Eukaryote PL
dbCAN_2 0.9224 0.1305 0.8926 0.9522 All CE
dbCAN_2:HMMER 0.9248 0.1119 0.8993 0.9504 All CE
dbCAN_2:DIAMOND 0.8636 0.2234 0.8126 0.9147 All CE
dbCAN_2:Hotpep 0.8555 0.1705 0.8165 0.8944 All CE
dbCAN_3 0.9236 0.1397 0.8916 0.9555 All CE
dbCAN_3:HMMER 0.9302 0.1073 0.9057 0.9547 All CE
dbCAN_3:DIAMOND 0.9152 0.1682 0.8768 0.9537 All CE
dbCAN_3:eCAMI 0.8314 0.2020 0.7852 0.8775 All CE
dbCAN_4 0.9638 0.0854 0.9443 0.9833 All CE
dbCAN_4:HMMER 0.9487 0.0957 0.9268 0.9706 All CE
dbCAN_4:DIAMOND 0.9507 0.1192 0.9234 0.9779 All CE
dbCAN_4:dbCAN-sub 0.9617 0.0863 0.9420 0.9814 All CE
CUPP 0.9250 0.1095 0.9000 0.9500 All CE
dbCAN_2 0.9052 0.1414 0.8600 0.9505 Bacteria CE
dbCAN_2:HMMER 0.8823 0.1265 0.8419 0.9228 Bacteria CE
dbCAN_2:DIAMOND 0.8759 0.1626 0.8239 0.9279 Bacteria CE
dbCAN_2:Hotpep 0.8730 0.1582 0.8224 0.9236 Bacteria CE
dbCAN_3 0.8941 0.1625 0.8421 0.9460 Bacteria CE
dbCAN_3:HMMER 0.8875 0.1215 0.8486 0.9264 Bacteria CE
dbCAN_3:DIAMOND 0.9092 0.1444 0.8630 0.9554 Bacteria CE
dbCAN_3:eCAMI 0.8444 0.1781 0.7874 0.9013 Bacteria CE
dbCAN_4 0.9400 0.1099 0.9049 0.9751 Bacteria CE
dbCAN_4:HMMER 0.9143 0.1191 0.8762 0.9524 Bacteria CE
dbCAN_4:DIAMOND 0.9299 0.1425 0.8843 0.9754 Bacteria CE
dbCAN_4:dbCAN-sub 0.9381 0.1101 0.9029 0.9733 Bacteria CE
CUPP 0.9000 0.1196 0.8617 0.9382 Bacteria CE
dbCAN_2 0.9415 0.1162 0.9022 0.9808 Eukaryote CE
dbCAN_2:HMMER 0.9721 0.0682 0.9490 0.9952 Eukaryote CE
dbCAN_2:DIAMOND 0.8501 0.2777 0.7561 0.9440 Eukaryote CE
dbCAN_2:Hotpep 0.8360 0.1836 0.7739 0.8981 Eukaryote CE
dbCAN_3 0.9563 0.1017 0.9219 0.9908 Eukaryote CE
dbCAN_3:HMMER 0.9777 0.0616 0.9568 0.9985 Eukaryote CE
dbCAN_3:DIAMOND 0.9219 0.1931 0.8566 0.9873 Eukaryote CE
dbCAN_3:eCAMI 0.8169 0.2274 0.7400 0.8939 Eukaryote CE
dbCAN_4 0.9902 0.0287 0.9805 1.0000 Eukaryote CE
dbCAN_4:HMMER 0.9869 0.0309 0.9765 0.9974 Eukaryote CE
dbCAN_4:DIAMOND 0.9738 0.0823 0.9459 1.0016 Eukaryote CE
dbCAN_4:dbCAN-sub 0.9880 0.0335 0.9766 0.9993 Eukaryote CE
CUPP 0.9528 0.0907 0.9221 0.9835 Eukaryote CE
dbCAN_2 0.9104 0.1322 0.8720 0.9488 All AA
dbCAN_2:HMMER 0.9173 0.1137 0.8843 0.9503 All AA
dbCAN_2:DIAMOND 0.8766 0.1603 0.8301 0.9232 All AA
dbCAN_2:Hotpep 0.8733 0.1924 0.8174 0.9292 All AA
dbCAN_3 0.9436 0.1100 0.9117 0.9756 All AA
dbCAN_3:HMMER 0.9434 0.1112 0.9111 0.9757 All AA
dbCAN_3:DIAMOND 0.9413 0.1144 0.9080 0.9745 All AA
dbCAN_3:eCAMI 0.8006 0.2558 0.7263 0.8749 All AA
dbCAN_4 0.9464 0.1085 0.9149 0.9779 All AA
dbCAN_4:HMMER 0.9434 0.1112 0.9111 0.9757 All AA
dbCAN_4:DIAMOND 0.9306 0.1226 0.8950 0.9662 All AA
dbCAN_4:dbCAN-sub 0.9430 0.1090 0.9113 0.9746 All AA
CUPP 0.9011 0.1376 0.8611 0.9410 All AA
dbCAN_2 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:Hotpep 0.9091 0.3015 0.7065 1.1116 Bacteria AA
dbCAN_3 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:eCAMI 0.9091 0.3015 0.7065 1.1116 Bacteria AA
dbCAN_4 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:DIAMOND 0.9697 0.1005 0.9022 1.0372 Bacteria AA
dbCAN_4:dbCAN-sub 1.0000 0.0000 1.0000 1.0000 Bacteria AA
CUPP 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2 0.8838 0.1401 0.8371 0.9305 Eukaryote AA
dbCAN_2:HMMER 0.8927 0.1190 0.8530 0.9324 Eukaryote AA
dbCAN_2:DIAMOND 0.8399 0.1659 0.7846 0.8952 Eukaryote AA
dbCAN_2:Hotpep 0.8626 0.1503 0.8125 0.9127 Eukaryote AA
dbCAN_3 0.9269 0.1205 0.8867 0.9670 Eukaryote AA
dbCAN_3:HMMER 0.9266 0.1219 0.8860 0.9673 Eukaryote AA
dbCAN_3:DIAMOND 0.9238 0.1254 0.8820 0.9656 Eukaryote AA
dbCAN_3:eCAMI 0.7683 0.2356 0.6898 0.8469 Eukaryote AA
dbCAN_4 0.9305 0.1193 0.8907 0.9703 Eukaryote AA
dbCAN_4:HMMER 0.9266 0.1219 0.8860 0.9673 Eukaryote AA
dbCAN_4:DIAMOND 0.9190 0.1274 0.8765 0.9615 Eukaryote AA
dbCAN_4:dbCAN-sub 0.9260 0.1193 0.8863 0.9658 Eukaryote AA
CUPP 0.8716 0.1443 0.8235 0.9197 Eukaryote AA
dbCAN_2 0.7970 0.1542 0.7627 0.8314 All CBM
dbCAN_2:HMMER 0.5414 0.2150 0.4935 0.5892 All CBM
dbCAN_2:DIAMOND 0.8325 0.1762 0.7933 0.8717 All CBM
dbCAN_2:Hotpep 0.5508 0.1723 0.5125 0.5892 All CBM
dbCAN_3 0.8354 0.1546 0.8010 0.8698 All CBM
dbCAN_3:HMMER 0.5425 0.2134 0.4950 0.5900 All CBM
dbCAN_3:DIAMOND 0.8986 0.1383 0.8678 0.9294 All CBM
dbCAN_3:eCAMI 0.6810 0.2079 0.6347 0.7272 All CBM
dbCAN_4 0.8547 0.1392 0.8237 0.8857 All CBM
dbCAN_4:HMMER 0.5954 0.2202 0.5464 0.6444 All CBM
dbCAN_4:DIAMOND 0.9072 0.1362 0.8768 0.9375 All CBM
dbCAN_4:dbCAN-sub 0.8576 0.1492 0.8244 0.8908 All CBM
CUPP 0.0000 0.0000 0.0000 0.0000 All CBM
dbCAN_2 0.8144 0.1600 0.7632 0.8656 Bacteria CBM
dbCAN_2:HMMER 0.6348 0.1986 0.5712 0.6983 Bacteria CBM
dbCAN_2:DIAMOND 0.8327 0.1875 0.7727 0.8926 Bacteria CBM
dbCAN_2:Hotpep 0.5583 0.2060 0.4925 0.6242 Bacteria CBM
dbCAN_3 0.8330 0.1861 0.7734 0.8925 Bacteria CBM
dbCAN_3:HMMER 0.6337 0.1981 0.5704 0.6971 Bacteria CBM
dbCAN_3:DIAMOND 0.8648 0.1691 0.8108 0.9189 Bacteria CBM
dbCAN_3:eCAMI 0.6709 0.2250 0.5989 0.7428 Bacteria CBM
dbCAN_4 0.9296 0.0604 0.9103 0.9489 Bacteria CBM
dbCAN_4:HMMER 0.7163 0.1850 0.6571 0.7755 Bacteria CBM
dbCAN_4:DIAMOND 0.8607 0.1698 0.8063 0.9150 Bacteria CBM
dbCAN_4:dbCAN-sub 0.9484 0.0628 0.9283 0.9685 Bacteria CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Bacteria CBM
dbCAN_2 0.7797 0.1483 0.7322 0.8271 Eukaryote CBM
dbCAN_2:HMMER 0.4480 0.1904 0.3871 0.5089 Eukaryote CBM
dbCAN_2:DIAMOND 0.8324 0.1666 0.7791 0.8857 Eukaryote CBM
dbCAN_2:Hotpep 0.5433 0.1325 0.5009 0.5856 Eukaryote CBM
dbCAN_3 0.8378 0.1173 0.8003 0.8754 Eukaryote CBM
dbCAN_3:HMMER 0.4512 0.1895 0.3906 0.5118 Eukaryote CBM
dbCAN_3:DIAMOND 0.9323 0.0884 0.9040 0.9606 Eukaryote CBM
dbCAN_3:eCAMI 0.6911 0.1917 0.6297 0.7524 Eukaryote CBM
dbCAN_4 0.7798 0.1552 0.7302 0.8295 Eukaryote CBM
dbCAN_4:HMMER 0.4745 0.1843 0.4155 0.5334 Eukaryote CBM
dbCAN_4:DIAMOND 0.9537 0.0657 0.9327 0.9747 Eukaryote CBM
dbCAN_4:dbCAN-sub 0.7669 0.1557 0.7171 0.8166 Eukaryote CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Eukaryote CBM
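
The confidence intervals in these tables are symmetric about the mean, which is why several upper bounds exceed 1 (for example, 1.1116 for dbCAN_2:Hotpep on Bacteria AA). This section does not state how the intervals were computed. The sketch below assumes a two-sided 95% Student's t interval and uses eleven hypothetical per-test-set scores (ten of 1.0 and one of 0.0), chosen only because they are consistent with that row's summary values. The critical value 2.228 (10 degrees of freedom) is hard-coded to keep the example free of third-party dependencies.

```python
import math
import statistics

# Hypothetical per-test-set scores (assumption, not the report's raw data).
scores = [1.0] * 10 + [0.0]

# Two-sided 95% critical value of Student's t for len(scores) - 1 = 10
# degrees of freedom (standard table value, hard-coded).
T_CRIT_DF10 = 2.228


def summarise(values, t_crit):
    """Return (mean, sample SD, lower CI, upper CI) for a list of scores."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    half_width = t_crit * sd / math.sqrt(len(values))
    return mean, sd, mean - half_width, mean + half_width


mean, sd, lower_ci, upper_ci = summarise(scores, T_CRIT_DF10)
print(f"{mean:.4f} {sd:.4f} {lower_ci:.4f} {upper_ci:.4f}")
```

Because the interval is not truncated to the valid range of the statistic, an upper bound above 1 indicates a wide interval around a high mean rather than a score greater than 1.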

8.2.5 Accuracy

Table 8.10: Overall performance (represented by the Accuracy) of CAZy class classification by CAZy classifiers per taxonomy group
Prediction_tool Mean Standard Deviation LowerCI UpperCI Tax_group CAZy_class
dbCAN_2 0.9658 0.0361 0.9578 0.9738 All GH
dbCAN_2:HMMER 0.9536 0.0330 0.9463 0.9609 All GH
dbCAN_2:DIAMOND 0.9644 0.0415 0.9551 0.9736 All GH
dbCAN_2:Hotpep 0.9297 0.0495 0.9187 0.9407 All GH
dbCAN_3 0.9768 0.0304 0.9701 0.9836 All GH
dbCAN_3:HMMER 0.9599 0.0331 0.9525 0.9672 All GH
dbCAN_3:DIAMOND 0.9812 0.0295 0.9747 0.9878 All GH
dbCAN_3:eCAMI 0.9364 0.0485 0.9256 0.9472 All GH
dbCAN_4 0.9743 0.0307 0.9675 0.9812 All GH
dbCAN_4:HMMER 0.9601 0.0328 0.9528 0.9674 All GH
dbCAN_4:DIAMOND 0.9812 0.0304 0.9745 0.9880 All GH
dbCAN_4:dbCAN-sub 0.9731 0.0297 0.9665 0.9797 All GH
CUPP 0.9540 0.0293 0.9475 0.9605 All GH
dbCAN_2 0.9660 0.0387 0.9536 0.9784 Bacteria GH
dbCAN_2:HMMER 0.9544 0.0399 0.9416 0.9672 Bacteria GH
dbCAN_2:DIAMOND 0.9611 0.0445 0.9469 0.9753 Bacteria GH
dbCAN_2:Hotpep 0.9375 0.0449 0.9231 0.9518 Bacteria GH
dbCAN_3 0.9788 0.0346 0.9678 0.9899 Bacteria GH
dbCAN_3:HMMER 0.9579 0.0409 0.9448 0.9710 Bacteria GH
dbCAN_3:DIAMOND 0.9773 0.0324 0.9669 0.9877 Bacteria GH
dbCAN_3:eCAMI 0.9472 0.0502 0.9312 0.9633 Bacteria GH
dbCAN_4 0.9762 0.0378 0.9641 0.9883 Bacteria GH
dbCAN_4:HMMER 0.9584 0.0405 0.9454 0.9713 Bacteria GH
dbCAN_4:DIAMOND 0.9766 0.0361 0.9650 0.9881 Bacteria GH
dbCAN_4:dbCAN-sub 0.9765 0.0361 0.9649 0.9880 Bacteria GH
CUPP 0.9555 0.0313 0.9455 0.9655 Bacteria GH
dbCAN_2 0.9656 0.0337 0.9548 0.9763 Eukaryote GH
dbCAN_2:HMMER 0.9528 0.0247 0.9449 0.9607 Eukaryote GH
dbCAN_2:DIAMOND 0.9676 0.0386 0.9553 0.9800 Eukaryote GH
dbCAN_2:Hotpep 0.9219 0.0531 0.9050 0.9389 Eukaryote GH
dbCAN_3 0.9748 0.0259 0.9666 0.9831 Eukaryote GH
dbCAN_3:HMMER 0.9619 0.0231 0.9545 0.9692 Eukaryote GH
dbCAN_3:DIAMOND 0.9852 0.0261 0.9768 0.9935 Eukaryote GH
dbCAN_3:eCAMI 0.9255 0.0446 0.9113 0.9398 Eukaryote GH
dbCAN_4 0.9725 0.0218 0.9655 0.9795 Eukaryote GH
dbCAN_4:HMMER 0.9619 0.0231 0.9545 0.9692 Eukaryote GH
dbCAN_4:DIAMOND 0.9859 0.0230 0.9785 0.9933 Eukaryote GH
dbCAN_4:dbCAN-sub 0.9698 0.0215 0.9630 0.9767 Eukaryote GH
CUPP 0.9525 0.0275 0.9437 0.9613 Eukaryote GH
dbCAN_2 0.9550 0.0770 0.9379 0.9721 All GT
dbCAN_2:HMMER 0.9498 0.0598 0.9365 0.9631 All GT
dbCAN_2:DIAMOND 0.9672 0.0821 0.9489 0.9854 All GT
dbCAN_2:Hotpep 0.9031 0.0908 0.8829 0.9233 All GT
dbCAN_3 0.9751 0.0585 0.9621 0.9881 All GT
dbCAN_3:HMMER 0.9503 0.0597 0.9370 0.9635 All GT
dbCAN_3:DIAMOND 0.9848 0.0567 0.9722 0.9974 All GT
dbCAN_3:eCAMI 0.9413 0.0822 0.9230 0.9596 All GT
dbCAN_4 0.9781 0.0572 0.9654 0.9909 All GT
dbCAN_4:HMMER 0.9503 0.0597 0.9370 0.9635 All GT
dbCAN_4:DIAMOND 0.9838 0.0577 0.9710 0.9966 All GT
dbCAN_4:dbCAN-sub 0.9773 0.0519 0.9657 0.9889 All GT
CUPP 0.9462 0.0579 0.9333 0.9591 All GT
dbCAN_2 0.9581 0.0752 0.9340 0.9821 Bacteria GT
dbCAN_2:HMMER 0.9469 0.0757 0.9227 0.9711 Bacteria GT
dbCAN_2:DIAMOND 0.9704 0.0804 0.9447 0.9961 Bacteria GT
dbCAN_2:Hotpep 0.8982 0.0676 0.8766 0.9199 Bacteria GT
dbCAN_3 0.9734 0.0739 0.9497 0.9970 Bacteria GT
dbCAN_3:HMMER 0.9463 0.0752 0.9223 0.9704 Bacteria GT
dbCAN_3:DIAMOND 0.9827 0.0721 0.9596 1.0057 Bacteria GT
dbCAN_3:eCAMI 0.9521 0.0640 0.9317 0.9726 Bacteria GT
dbCAN_4 0.9764 0.0732 0.9530 0.9998 Bacteria GT
dbCAN_4:HMMER 0.9466 0.0753 0.9225 0.9706 Bacteria GT
dbCAN_4:DIAMOND 0.9797 0.0738 0.9561 1.0033 Bacteria GT
dbCAN_4:dbCAN-sub 0.9770 0.0650 0.9562 0.9978 Bacteria GT
CUPP 0.9602 0.0488 0.9445 0.9758 Bacteria GT
dbCAN_2 0.9520 0.0795 0.9265 0.9774 Eukaryote GT
dbCAN_2:HMMER 0.9527 0.0386 0.9403 0.9650 Eukaryote GT
dbCAN_2:DIAMOND 0.9639 0.0847 0.9368 0.9910 Eukaryote GT
dbCAN_2:Hotpep 0.9079 0.1099 0.8728 0.9430 Eukaryote GT
dbCAN_3 0.9768 0.0384 0.9646 0.9891 Eukaryote GT
dbCAN_3:HMMER 0.9542 0.0390 0.9417 0.9667 Eukaryote GT
dbCAN_3:DIAMOND 0.9869 0.0362 0.9754 0.9985 Eukaryote GT
dbCAN_3:eCAMI 0.9304 0.0967 0.8995 0.9614 Eukaryote GT
dbCAN_4 0.9798 0.0354 0.9685 0.9911 Eukaryote GT
dbCAN_4:HMMER 0.9540 0.0389 0.9415 0.9664 Eukaryote GT
dbCAN_4:DIAMOND 0.9879 0.0354 0.9766 0.9992 Eukaryote GT
dbCAN_4:dbCAN-sub 0.9776 0.0352 0.9664 0.9889 Eukaryote GT
CUPP 0.9322 0.0633 0.9120 0.9525 Eukaryote GT
dbCAN_2 0.9959 0.0070 0.9938 0.9979 All PL
dbCAN_2:HMMER 0.9963 0.0062 0.9944 0.9981 All PL
dbCAN_2:DIAMOND 0.9954 0.0073 0.9933 0.9976 All PL
dbCAN_2:Hotpep 0.9929 0.0131 0.9891 0.9968 All PL
dbCAN_3 0.9990 0.0030 0.9981 0.9999 All PL
dbCAN_3:HMMER 0.9986 0.0035 0.9975 0.9996 All PL
dbCAN_3:DIAMOND 0.9985 0.0041 0.9973 0.9997 All PL
dbCAN_3:eCAMI 0.9920 0.0129 0.9883 0.9958 All PL
dbCAN_4 0.9986 0.0035 0.9975 0.9996 All PL
dbCAN_4:HMMER 0.9986 0.0035 0.9975 0.9996 All PL
dbCAN_4:DIAMOND 0.9988 0.0039 0.9976 0.9999 All PL
dbCAN_4:dbCAN-sub 0.9988 0.0032 0.9978 0.9997 All PL
CUPP 0.9941 0.0096 0.9913 0.9970 All PL
dbCAN_2 0.9957 0.0077 0.9928 0.9986 Bacteria PL
dbCAN_2:HMMER 0.9960 0.0067 0.9935 0.9985 Bacteria PL
dbCAN_2:DIAMOND 0.9957 0.0077 0.9928 0.9986 Bacteria PL
dbCAN_2:Hotpep 0.9917 0.0157 0.9859 0.9976 Bacteria PL
dbCAN_3 0.9990 0.0030 0.9979 1.0001 Bacteria PL
dbCAN_3:HMMER 0.9983 0.0038 0.9969 0.9997 Bacteria PL
dbCAN_3:DIAMOND 0.9990 0.0030 0.9979 1.0001 Bacteria PL
dbCAN_3:eCAMI 0.9911 0.0153 0.9854 0.9967 Bacteria PL
dbCAN_4 0.9983 0.0038 0.9969 0.9997 Bacteria PL
dbCAN_4:HMMER 0.9983 0.0038 0.9969 0.9997 Bacteria PL
dbCAN_4:DIAMOND 0.9990 0.0030 0.9979 1.0001 Bacteria PL
dbCAN_4:dbCAN-sub 0.9987 0.0034 0.9974 1.0000 Bacteria PL
CUPP 0.9930 0.0112 0.9888 0.9972 Bacteria PL
dbCAN_2 0.9962 0.0056 0.9933 0.9990 Eukaryote PL
dbCAN_2:HMMER 0.9967 0.0055 0.9939 0.9995 Eukaryote PL
dbCAN_2:DIAMOND 0.9950 0.0067 0.9915 0.9984 Eukaryote PL
dbCAN_2:Hotpep 0.9950 0.0066 0.9916 0.9984 Eukaryote PL
dbCAN_3 0.9989 0.0030 0.9974 1.0005 Eukaryote PL
dbCAN_3:HMMER 0.9989 0.0030 0.9974 1.0005 Eukaryote PL
dbCAN_3:DIAMOND 0.9977 0.0055 0.9948 1.0006 Eukaryote PL
dbCAN_3:eCAMI 0.9938 0.0065 0.9905 0.9972 Eukaryote PL
dbCAN_4 0.9990 0.0029 0.9974 1.0005 Eukaryote PL
dbCAN_4:HMMER 0.9989 0.0030 0.9974 1.0005 Eukaryote PL
dbCAN_4:DIAMOND 0.9983 0.0052 0.9956 1.0010 Eukaryote PL
dbCAN_4:dbCAN-sub 0.9990 0.0029 0.9975 1.0005 Eukaryote PL
CUPP 0.9961 0.0056 0.9933 0.9990 Eukaryote PL
dbCAN_2 0.9893 0.0225 0.9842 0.9945 All CE
dbCAN_2:HMMER 0.9901 0.0189 0.9858 0.9945 All CE
dbCAN_2:DIAMOND 0.9862 0.0242 0.9807 0.9918 All CE
dbCAN_2:Hotpep 0.9821 0.0251 0.9764 0.9878 All CE
dbCAN_3 0.9896 0.0220 0.9846 0.9946 All CE
dbCAN_3:HMMER 0.9910 0.0177 0.9870 0.9951 All CE
dbCAN_3:DIAMOND 0.9890 0.0231 0.9837 0.9943 All CE
dbCAN_3:eCAMI 0.9818 0.0249 0.9761 0.9875 All CE
dbCAN_4 0.9943 0.0172 0.9904 0.9982 All CE
dbCAN_4:HMMER 0.9929 0.0173 0.9889 0.9968 All CE
dbCAN_4:DIAMOND 0.9919 0.0218 0.9870 0.9969 All CE
dbCAN_4:dbCAN-sub 0.9938 0.0175 0.9898 0.9978 All CE
CUPP 0.9900 0.0187 0.9857 0.9943 All CE
dbCAN_2 0.9849 0.0283 0.9758 0.9939 Bacteria CE
dbCAN_2:HMMER 0.9840 0.0239 0.9764 0.9916 Bacteria CE
dbCAN_2:DIAMOND 0.9824 0.0282 0.9734 0.9915 Bacteria CE
dbCAN_2:Hotpep 0.9787 0.0309 0.9688 0.9886 Bacteria CE
dbCAN_3 0.9841 0.0280 0.9752 0.9931 Bacteria CE
dbCAN_3:HMMER 0.9852 0.0224 0.9780 0.9923 Bacteria CE
dbCAN_3:DIAMOND 0.9864 0.0268 0.9778 0.9950 Bacteria CE
dbCAN_3:eCAMI 0.9777 0.0303 0.9680 0.9874 Bacteria CE
dbCAN_4 0.9904 0.0228 0.9831 0.9977 Bacteria CE
dbCAN_4:HMMER 0.9882 0.0225 0.9810 0.9954 Bacteria CE
dbCAN_4:DIAMOND 0.9886 0.0263 0.9801 0.9970 Bacteria CE
dbCAN_4:dbCAN-sub 0.9899 0.0230 0.9826 0.9973 Bacteria CE
CUPP 0.9852 0.0236 0.9777 0.9927 Bacteria CE
dbCAN_2 0.9943 0.0119 0.9902 0.9983 Eukaryote CE
dbCAN_2:HMMER 0.9970 0.0062 0.9949 0.9991 Eukaryote CE
dbCAN_2:DIAMOND 0.9905 0.0182 0.9843 0.9966 Eukaryote CE
dbCAN_2:Hotpep 0.9859 0.0159 0.9805 0.9913 Eukaryote CE
dbCAN_3 0.9957 0.0095 0.9924 0.9989 Eukaryote CE
dbCAN_3:HMMER 0.9975 0.0054 0.9957 0.9994 Eukaryote CE
dbCAN_3:DIAMOND 0.9919 0.0181 0.9858 0.9980 Eukaryote CE
dbCAN_3:eCAMI 0.9863 0.0162 0.9808 0.9918 Eukaryote CE
dbCAN_4 0.9986 0.0041 0.9972 1.0000 Eukaryote CE
dbCAN_4:HMMER 0.9981 0.0046 0.9965 0.9996 Eukaryote CE
dbCAN_4:DIAMOND 0.9957 0.0147 0.9907 1.0007 Eukaryote CE
dbCAN_4:dbCAN-sub 0.9981 0.0055 0.9963 1.0000 Eukaryote CE
CUPP 0.9953 0.0086 0.9924 0.9983 Eukaryote CE
dbCAN_2 0.9856 0.0204 0.9796 0.9915 All AA
dbCAN_2:HMMER 0.9856 0.0202 0.9797 0.9914 All AA
dbCAN_2:DIAMOND 0.9837 0.0212 0.9775 0.9899 All AA
dbCAN_2:Hotpep 0.9827 0.0227 0.9761 0.9893 All AA
dbCAN_3 0.9901 0.0188 0.9846 0.9955 All AA
dbCAN_3:HMMER 0.9901 0.0196 0.9844 0.9958 All AA
dbCAN_3:DIAMOND 0.9897 0.0200 0.9839 0.9955 All AA
dbCAN_3:eCAMI 0.9791 0.0233 0.9724 0.9859 All AA
dbCAN_4 0.9907 0.0188 0.9853 0.9962 All AA
dbCAN_4:HMMER 0.9901 0.0196 0.9844 0.9958 All AA
dbCAN_4:DIAMOND 0.9897 0.0198 0.9840 0.9954 All AA
dbCAN_4:dbCAN-sub 0.9902 0.0189 0.9847 0.9956 All AA
CUPP 0.9841 0.0215 0.9778 0.9903 All AA
dbCAN_2 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2:Hotpep 0.9991 0.0030 0.9971 1.0011 Bacteria AA
dbCAN_3 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_3:eCAMI 0.9991 0.0030 0.9971 1.0011 Bacteria AA
dbCAN_4 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_4:DIAMOND 0.9991 0.0029 0.9972 1.0011 Bacteria AA
dbCAN_4:dbCAN-sub 1.0000 0.0000 1.0000 1.0000 Bacteria AA
CUPP 1.0000 0.0000 1.0000 1.0000 Bacteria AA
dbCAN_2 0.9813 0.0215 0.9741 0.9884 Eukaryote AA
dbCAN_2:HMMER 0.9813 0.0212 0.9742 0.9884 Eukaryote AA
dbCAN_2:DIAMOND 0.9789 0.0220 0.9715 0.9862 Eukaryote AA
dbCAN_2:Hotpep 0.9778 0.0237 0.9699 0.9857 Eukaryote AA
dbCAN_3 0.9871 0.0206 0.9803 0.9940 Eukaryote AA
dbCAN_3:HMMER 0.9872 0.0215 0.9800 0.9943 Eukaryote AA
dbCAN_3:DIAMOND 0.9866 0.0219 0.9793 0.9939 Eukaryote AA
dbCAN_3:eCAMI 0.9732 0.0234 0.9654 0.9810 Eukaryote AA
dbCAN_4 0.9880 0.0207 0.9811 0.9949 Eukaryote AA
dbCAN_4:HMMER 0.9872 0.0215 0.9800 0.9943 Eukaryote AA
dbCAN_4:DIAMOND 0.9869 0.0217 0.9797 0.9942 Eukaryote AA
dbCAN_4:dbCAN-sub 0.9872 0.0207 0.9803 0.9941 Eukaryote AA
CUPP 0.9793 0.0225 0.9719 0.9868 Eukaryote AA
dbCAN_2 0.9656 0.0257 0.9599 0.9713 All CBM
dbCAN_2:HMMER 0.9378 0.0296 0.9312 0.9444 All CBM
dbCAN_2:DIAMOND 0.9719 0.0252 0.9663 0.9775 All CBM
dbCAN_2:Hotpep 0.8823 0.0551 0.8701 0.8946 All CBM
dbCAN_3 0.9718 0.0228 0.9667 0.9768 All CBM
dbCAN_3:HMMER 0.9379 0.0300 0.9313 0.9446 All CBM
dbCAN_3:DIAMOND 0.9814 0.0185 0.9773 0.9855 All CBM
dbCAN_3:eCAMI 0.9267 0.0565 0.9141 0.9392 All CBM
dbCAN_4 0.9763 0.0207 0.9717 0.9809 All CBM
dbCAN_4:HMMER 0.9436 0.0300 0.9370 0.9503 All CBM
dbCAN_4:DIAMOND 0.9830 0.0182 0.9790 0.9870 All CBM
dbCAN_4:dbCAN-sub 0.9761 0.0244 0.9707 0.9816 All CBM
CUPP 0.8851 0.0851 0.8662 0.9040 All CBM
dbCAN_2 0.9672 0.0266 0.9587 0.9758 Bacteria CBM
dbCAN_2:HMMER 0.9400 0.0356 0.9286 0.9514 Bacteria CBM
dbCAN_2:DIAMOND 0.9718 0.0240 0.9641 0.9795 Bacteria CBM
dbCAN_2:Hotpep 0.8735 0.0634 0.8532 0.8937 Bacteria CBM
dbCAN_3 0.9707 0.0237 0.9631 0.9783 Bacteria CBM
dbCAN_3:HMMER 0.9400 0.0364 0.9283 0.9516 Bacteria CBM
dbCAN_3:DIAMOND 0.9760 0.0178 0.9703 0.9817 Bacteria CBM
dbCAN_3:eCAMI 0.9171 0.0605 0.8978 0.9365 Bacteria CBM
dbCAN_4 0.9855 0.0128 0.9814 0.9896 Bacteria CBM
dbCAN_4:HMMER 0.9494 0.0365 0.9377 0.9611 Bacteria CBM
dbCAN_4:DIAMOND 0.9755 0.0181 0.9697 0.9812 Bacteria CBM
dbCAN_4:dbCAN-sub 0.9892 0.0129 0.9851 0.9933 Bacteria CBM
CUPP 0.8614 0.1127 0.8253 0.8974 Bacteria CBM
dbCAN_2 0.9640 0.0250 0.9560 0.9720 Eukaryote CBM
dbCAN_2:HMMER 0.9356 0.0223 0.9284 0.9427 Eukaryote CBM
dbCAN_2:DIAMOND 0.9720 0.0267 0.9635 0.9806 Eukaryote CBM
dbCAN_2:Hotpep 0.8912 0.0445 0.8770 0.9054 Eukaryote CBM
dbCAN_3 0.9728 0.0221 0.9658 0.9799 Eukaryote CBM
dbCAN_3:HMMER 0.9359 0.0221 0.9288 0.9430 Eukaryote CBM
dbCAN_3:DIAMOND 0.9869 0.0178 0.9812 0.9926 Eukaryote CBM
dbCAN_3:eCAMI 0.9362 0.0512 0.9198 0.9526 Eukaryote CBM
dbCAN_4 0.9672 0.0230 0.9598 0.9745 Eukaryote CBM
dbCAN_4:HMMER 0.9379 0.0205 0.9313 0.9444 Eukaryote CBM
dbCAN_4:DIAMOND 0.9905 0.0150 0.9857 0.9953 Eukaryote CBM
dbCAN_4:dbCAN-sub 0.9631 0.0262 0.9547 0.9715 Eukaryote CBM
CUPP 0.9088 0.0286 0.8997 0.9180 Eukaryote CBM

9 CAZy class multilabel classification taxonomic performance

Table 9.1: Rand Index of CAZyme classifier classification of CAZy class annotations
Prediction_tool Lower CI Mean Upper CI Standard Deviation Tax_group
dbCAN_2 0.9755 0.9769 0.9782 0.0855 All
dbCAN_2:HMMER 0.9679 0.9694 0.9709 0.0969 All
dbCAN_2:DIAMOND 0.9778 0.9791 0.9804 0.0821 All
dbCAN_2:Hotpep 0.9445 0.9464 0.9483 0.1248 All
dbCAN_3 0.9828 0.9839 0.9850 0.0721 All
dbCAN_3:HMMER 0.9699 0.9714 0.9728 0.0939 All
dbCAN_3:DIAMOND 0.9865 0.9875 0.9885 0.0643 All
dbCAN_3:eCAMI 0.9593 0.9610 0.9627 0.1092 All
dbCAN_4 0.9845 0.9856 0.9866 0.0683 All
dbCAN_4:HMMER 0.9712 0.9727 0.9741 0.0920 All
dbCAN_4:DIAMOND 0.9871 0.9880 0.9890 0.0627 All
dbCAN_4:dbCAN-sub 0.9840 0.9851 0.9861 0.0694 All
CUPP 0.9580 0.9597 0.9614 0.1093 All
dbCAN_2 0.9766 0.9784 0.9802 0.0825 Bacteria
dbCAN_2:HMMER 0.9679 0.9700 0.9721 0.0958 Bacteria
dbCAN_2:DIAMOND 0.9784 0.9802 0.9819 0.0801 Bacteria
dbCAN_2:Hotpep 0.9438 0.9466 0.9493 0.1249 Bacteria
dbCAN_3 0.9825 0.9840 0.9856 0.0716 Bacteria
dbCAN_3:HMMER 0.9690 0.9710 0.9731 0.0944 Bacteria
dbCAN_3:DIAMOND 0.9853 0.9867 0.9882 0.0662 Bacteria
dbCAN_3:eCAMI 0.9619 0.9642 0.9666 0.1054 Bacteria
dbCAN_4 0.9863 0.9877 0.9891 0.0632 Bacteria
dbCAN_4:HMMER 0.9712 0.9732 0.9752 0.0911 Bacteria
dbCAN_4:DIAMOND 0.9848 0.9863 0.9877 0.0669 Bacteria
dbCAN_4:dbCAN-sub 0.9870 0.9884 0.9897 0.0614 Bacteria
CUPP 0.9568 0.9592 0.9616 0.1101 Bacteria
dbCAN_2 0.9734 0.9753 0.9772 0.0885 Eukaryote
dbCAN_2:HMMER 0.9666 0.9688 0.9709 0.0979 Eukaryote
dbCAN_2:DIAMOND 0.9762 0.9781 0.9799 0.0840 Eukaryote
dbCAN_2:Hotpep 0.9435 0.9462 0.9490 0.1248 Eukaryote
dbCAN_3 0.9822 0.9838 0.9854 0.0725 Eukaryote
dbCAN_3:HMMER 0.9697 0.9717 0.9737 0.0934 Eukaryote
dbCAN_3:DIAMOND 0.9869 0.9883 0.9896 0.0624 Eukaryote
dbCAN_3:eCAMI 0.9553 0.9577 0.9602 0.1128 Eukaryote
dbCAN_4 0.9819 0.9835 0.9851 0.0730 Eukaryote
dbCAN_4:HMMER 0.9701 0.9721 0.9741 0.0930 Eukaryote
dbCAN_4:DIAMOND 0.9886 0.9898 0.9911 0.0582 Eukaryote
dbCAN_4:dbCAN-sub 0.9801 0.9817 0.9834 0.0765 Eukaryote
CUPP 0.9578 0.9602 0.9625 0.1086 Eukaryote

10 CAZy family classification

This section evaluates how well the CAZyme classifiers predict CAZy family annotations.

10.2 CAZy family sensitivity against specificity

For better resolution, the CAZy families can be grouped by their parent CAZy class and the performance of the tools compared one CAZy class at a time. Owing to the minimal variation in specificity scores, specificity was plotted as the log10 of the percentage specificity.
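
As a minimal sketch of this transformation, assuming specificity is stored as a fraction between 0 and 1 and that "percentage specificity log10" means log10(specificity × 100), the plotted value could be computed as follows. The function name is hypothetical.

```python
import math


def log10_percentage_specificity(specificity):
    """Convert a fractional specificity (0 < s <= 1) to log10 of its percentage."""
    if not 0 < specificity <= 1:
        raise ValueError("specificity must be in the interval (0, 1]")
    return math.log10(specificity * 100)


# A perfect specificity of 1.0 (100%) maps to 2.0; lower scores map to
# values just below 2.
for s in (1.0, 0.9999, 0.9989):
    print(s, log10_percentage_specificity(s))
```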

10.2.1 Glycoside Hydrolases

Figure 10.13 shows the plotting of sensitivity against specificity for each Glycoside Hydrolase CAZy family.

Figure 10.13: Scatter plot of recall (sensitivity) against specificity for predicting each CAZy family for each CAZyme classifier in the CAZy class Glycoside Hydrolases. Each GH CAZy family is represented as a single point on the plot.

10.2.2 Glycosyltransferases

Figure 10.14 shows the plotting of sensitivity against specificity for each Glycosyltransferase CAZy family.

Figure 10.14: Scatter plot of recall (sensitivity) against specificity for predicting each CAZy family for each CAZyme classifier in the CAZy class Glycosyltransferases. Each GT CAZy family is represented as a single point on the plot.

10.2.3 Polysaccharide Lyases

Figure 10.15 shows the plotting of sensitivity against specificity for each Polysaccharide Lyase CAZy family.

Figure 10.15: Scatter plot of recall (sensitivity) against specificity for predicting each CAZy family for each CAZyme classifier in the CAZy class Polysaccharide Lyases. Each PL CAZy family is represented as a single point on the plot.

10.2.4 Carbohydrate Esterases

Figure 10.16 shows the plotting of sensitivity against specificity for each Carbohydrate Esterase CAZy family.

Figure 10.16: Scatter plot of recall (sensitivity) against specificity for predicting each CAZy family for each CAZyme classifier in the CAZy class Carbohydrate Esterases. Each CE CAZy family is represented as a single point on the plot.

10.2.5 Auxiliary Activities

Figure 10.17 shows the plotting of sensitivity against specificity for each Auxiliary Activities CAZy family.

Figure 10.17: Scatter plot of recall (sensitivity) against specificity for predicting each CAZy family for each CAZyme classifier in the CAZy class Auxiliary Activities. Each AA CAZy family is represented as a single point on the plot.

10.2.6 Carbohydrate Binding Modules

Figure 10.18 shows the plotting of sensitivity against specificity for each Carbohydrate Binding Module CAZy family.

Figure 10.18: Scatter plot of recall (sensitivity) against specificity for predicting each CAZy family for each CAZyme classifier in the CAZy class Carbohydrate Binding Modules. Each CBM CAZy family is represented as a single point on the plot.

10.3 Consistently poor performing CAZy families

This section identifies the CAZy families for which at least three of the evaluated classifiers (counting the individual tools incorporated into dbCAN separately) produce a sensitivity score of less than 0.75.
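
The selection rule can be sketched as below. The nested dictionary layout, the function name, and the family labels and scores are hypothetical placeholders, not results from this evaluation. Only the threshold (0.75) and the minimum number of classifiers (three) come from the text.

```python
SENSITIVITY_THRESHOLD = 0.75  # from the text: sensitivity below 0.75
MIN_CLASSIFIERS = 3           # from the text: at least three classifiers


def difficult_families(sensitivity, threshold=SENSITIVITY_THRESHOLD,
                       min_classifiers=MIN_CLASSIFIERS):
    """Return families where enough classifiers fall below the threshold.

    `sensitivity` maps family -> {classifier name -> sensitivity score}.
    """
    selected = []
    for family, scores in sensitivity.items():
        n_low = sum(1 for score in scores.values() if score < threshold)
        if n_low >= min_classifiers:
            selected.append(family)
    return sorted(selected)


# Hypothetical scores for two placeholder families.
example = {
    "FAM_A": {"dbCAN_4": 0.70, "dbCAN_4:HMMER": 0.60, "CUPP": 0.10, "dbCAN_4:DIAMOND": 0.95},
    "FAM_B": {"dbCAN_4": 0.98, "dbCAN_4:HMMER": 0.74, "CUPP": 0.50, "dbCAN_4:DIAMOND": 0.99},
}
print(difficult_families(example))
```

Here only FAM_A is selected: three of its four scores are below 0.75, whereas FAM_B has only two.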

10.4 GH difficult families

10.5 GT difficult families

10.6 PL difficult families

10.7 CE difficult families

10.8 AA difficult families

10.9 CBM difficult families

10.10 Evaluation of multi-label CAZy family classification performance

CAZy annotates proteins in a domain-wise manner. Consequently, a single protein may be assigned to multiple CAZy families. The ability of a classifier to assign all the correct CAZy family annotations to a given protein is not captured when CAZy family classification performance is evaluated only per CAZy family, independently of all other CAZy families.

Multilabel classification arises when a single instance can be assigned to multiple classes. In this evaluation a single instance is a protein and the classes are CAZy families: a single CAZyme can be assigned to multiple CAZy families. This is important to take into consideration because the statistical approaches used to evaluate binary classification provide only a limited view of classifier performance when applied to multilabel classification.
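
To illustrate the difference, the sketch below encodes the annotations of one protein as a binary indicator vector over a small, hypothetical set of CAZy families. The family list, the example annotations and the helper name are assumptions made for illustration. A per-family view scores three of the four families as correct, even though the full set of annotations for the protein was not recovered.

```python
# Hypothetical family universe; a real evaluation would use every CAZy family.
FAMILIES = ["GH5", "GH13", "CBM2", "CBM20"]


def to_indicator(annotations, families=FAMILIES):
    """Encode a set of family annotations as a 0/1 vector over `families`."""
    return [1 if family in annotations else 0 for family in families]


# Hypothetical two-domain protein: the classifier misses the CBM20 domain.
truth_vec = to_indicator({"GH13", "CBM20"})
pred_vec = to_indicator({"GH13"})

# Per-family (binary) view: each family is judged independently.
per_family_correct = [t == p for t, p in zip(truth_vec, pred_vec)]

# Multilabel view: were all annotations of the protein recovered?
exact_match = truth_vec == pred_vec

print(per_family_correct, exact_match)
```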

The CAZy family multi-label classification performance is represented by the Rand Index (RI) and Adjusted Rand Index (ARI). The RI is a quantitative measure of the similarity between two clusterings: it considers all pairs of samples and counts the pairs that are assigned to the same cluster, or to different clusters, in both the predicted and true clusterings. In this case the two clusterings are the predicted and ground truth CAZy family annotations. The raw RI score is then “adjusted for chance” into the ARI score using the following scheme:

ARI = (RI - Expected_RI) / (max(RI) - Expected_RI)

This produces a score between -1 and 1. A score of 1 is produced if all predicted and known CAZy family annotations are identical, a score of 0 if the clustering is completely random, and a negative score if the clustering is systematically incorrect, that is, if the number of incorrect classifications of proteins is greater than would be expected from randomly annotating proteins with CAZy families.
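
A minimal, dependency-free sketch of both scores is given below. It uses the standard pair-counting formulation of the RI and ARI. This is not necessarily the implementation used in this evaluation. The sketch assumes each protein is scored by comparing a ground-truth and a predicted 0/1 family-indicator vector, and both example vectors are hypothetical.

```python
from collections import Counter
from math import comb


def rand_scores(truth, pred):
    """Return (RI, ARI) for two equal-length label sequences."""
    n = len(truth)
    if n != len(pred) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 labels")
    total_pairs = comb(n, 2)
    # Pairs placed together in both, in truth only, and in pred only.
    together_both = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    together_truth = sum(comb(c, 2) for c in Counter(truth).values())
    together_pred = sum(comb(c, 2) for c in Counter(pred).values())
    # RI: fraction of pairs on which the two labellings agree.
    agreements = total_pairs + 2 * together_both - together_truth - together_pred
    ri = agreements / total_pairs
    # ARI: (index - expected index) / (max index - expected index).
    expected = together_truth * together_pred / total_pairs
    max_index = (together_truth + together_pred) / 2
    if max_index == expected:
        ari = 1.0  # both labellings are trivial and identical
    else:
        ari = (together_both - expected) / (max_index - expected)
    return ri, ari


# Hypothetical indicator vectors over six families for one protein.
truth = [1, 1, 0, 0, 0, 0]
pred = [1, 0, 0, 0, 0, 0]
ri, ari = rand_scores(truth, pred)
print(round(ri, 4), round(ari, 4))
```

Because most families are absent from any one protein, the many shared zeros inflate the RI. The chance correction in the ARI removes much of that inflation, which is consistent with the RI values in Table 10.8 all lying close to 1 while the ARI values in Table 10.9 are more spread out.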

Table 10.8: Rand Index of CAZyme classifier classification of CAZy family annotations
Prediction_tool Mean Standard Deviation Lower CI Upper CI
dbCAN_2 0.9996 0.0013 0.9996 0.9997
dbCAN_2:HMMER 0.9995 0.0014 0.9995 0.9996
dbCAN_2:DIAMOND 0.9997 0.0012 0.9997 0.9997
dbCAN_2:Hotpep 0.9991 0.0023 0.9990 0.9991
dbCAN_3 0.9998 0.0011 0.9997 0.9998
dbCAN_3:HMMER 0.9996 0.0014 0.9995 0.9996
dbCAN_3:DIAMOND 0.9998 0.0010 0.9998 0.9998
dbCAN_3:eCAMI 0.9994 0.0017 0.9994 0.9994
dbCAN_4 0.9998 0.0011 0.9997 0.9998
dbCAN_4:HMMER 0.9996 0.0014 0.9996 0.9996
dbCAN_4:DIAMOND 0.9998 0.0009 0.9998 0.9998
dbCAN_4:dbCAN-sub 0.9998 0.0011 0.9997 0.9998
CUPP 0.9994 0.0015 0.9994 0.9995

Table 10.9: Adjusted Rand Index of CAZyme classifier classification of CAZy family annotations
Prediction_tool Mean Standard Deviation Lower CI Upper CI
dbCAN_2 0.9259 0.2568 0.9219 0.9299
dbCAN_2:HMMER 0.9145 0.2702 0.9103 0.9187
dbCAN_2:DIAMOND 0.9386 0.2367 0.9349 0.9422
dbCAN_2:Hotpep 0.8635 0.3219 0.8585 0.8684
dbCAN_3 0.9507 0.2127 0.9474 0.9540
dbCAN_3:HMMER 0.9201 0.2617 0.9161 0.9242
dbCAN_3:DIAMOND 0.9622 0.1878 0.9593 0.9651
dbCAN_3:eCAMI 0.8960 0.2928 0.8914 0.9005
dbCAN_4 0.9538 0.2050 0.9507 0.9570
dbCAN_4:HMMER 0.9221 0.2598 0.9180 0.9261
dbCAN_4:DIAMOND 0.9636 0.1847 0.9608 0.9665
dbCAN_4:dbCAN-sub 0.9544 0.2032 0.9512 0.9575
CUPP 0.9012 0.2832 0.8968 0.9055

11 CAZy family taxonomic performance

11.1 Across all of CAZy

11.1.1 Specificity

Table 11.1: Overall performance (represented by the Specificity) of CAZy family classification by CAZy classifiers per taxonomy group
Prediction_tool Mean Standard Deviation Lower CI Upper CI Tax_group
dbCAN_2 0.9999 0.0004 0.9999 0.9999 All
dbCAN_2:HMMER 0.9999 0.0003 0.9998 0.9999 All
dbCAN_2:DIAMOND 0.9999 0.0004 0.9998 0.9999 All
dbCAN_2:Hotpep 0.9994 0.0020 0.9992 0.9996 All
dbCAN_3 0.9999 0.0003 0.9999 0.9999 All
dbCAN_3:HMMER 0.9999 0.0004 0.9998 0.9999 All
dbCAN_3:DIAMOND 0.9999 0.0004 0.9998 0.9999 All
dbCAN_3:eCAMI 0.9997 0.0010 0.9996 0.9998 All
dbCAN_4 0.9999 0.0003 0.9999 0.9999 All
dbCAN_4:HMMER 0.9999 0.0004 0.9998 0.9999 All
dbCAN_4:DIAMOND 0.9999 0.0003 0.9998 0.9999 All
dbCAN_4:dbCAN-sub 0.9999 0.0004 0.9998 0.9999 All
CUPP 0.9999 0.0003 0.9999 1.0000 All
dbCAN_2 0.9999 0.0007 0.9998 1.0000 Bacteria
dbCAN_2:HMMER 0.9999 0.0005 0.9998 0.9999 Bacteria
dbCAN_2:DIAMOND 0.9998 0.0007 0.9998 0.9999 Bacteria
dbCAN_2:Hotpep 0.9989 0.0031 0.9986 0.9993 Bacteria
dbCAN_3 0.9999 0.0006 0.9998 1.0000 Bacteria
dbCAN_3:HMMER 0.9999 0.0005 0.9998 1.0000 Bacteria
dbCAN_3:DIAMOND 0.9999 0.0006 0.9998 0.9999 Bacteria
dbCAN_3:eCAMI 0.9994 0.0018 0.9992 0.9997 Bacteria
dbCAN_4 0.9999 0.0006 0.9998 1.0000 Bacteria
dbCAN_4:HMMER 0.9999 0.0005 0.9998 0.9999 Bacteria
dbCAN_4:DIAMOND 0.9999 0.0006 0.9998 0.9999 Bacteria
dbCAN_4:dbCAN-sub 0.9998 0.0006 0.9998 0.9999 Bacteria
CUPP 0.9999 0.0006 0.9998 1.0000 Bacteria
dbCAN_2 0.9998 0.0005 0.9998 0.9999 Eukaryote
dbCAN_2:HMMER 0.9998 0.0005 0.9997 0.9999 Eukaryote
dbCAN_2:DIAMOND 0.9998 0.0006 0.9997 0.9999 Eukaryote
dbCAN_2:Hotpep 0.9993 0.0023 0.9990 0.9996 Eukaryote
dbCAN_3 0.9998 0.0005 0.9998 0.9999 Eukaryote
dbCAN_3:HMMER 0.9998 0.0006 0.9997 0.9999 Eukaryote
dbCAN_3:DIAMOND 0.9997 0.0006 0.9997 0.9998 Eukaryote
dbCAN_3:eCAMI 0.9997 0.0012 0.9995 0.9998 Eukaryote
dbCAN_4 0.9998 0.0005 0.9997 0.9999 Eukaryote
dbCAN_4:HMMER 0.9998 0.0006 0.9997 0.9999 Eukaryote
dbCAN_4:DIAMOND 0.9997 0.0006 0.9997 0.9998 Eukaryote
dbCAN_4:dbCAN-sub 0.9998 0.0006 0.9997 0.9999 Eukaryote
CUPP 0.9998 0.0005 0.9998 0.9999 Eukaryote

11.1.2 Sensitivity

Table 11.2: Overall performance (represented by the Sensitivity) of CAZy family classification by CAZy classifiers per taxonomy group
Prediction_tool Mean Standard Deviation Lower CI Upper CI Tax_group
dbCAN_2 0.8388 0.2944 0.8065 0.8710 All
dbCAN_2:HMMER 0.8136 0.3417 0.7766 0.8506 All
dbCAN_2:DIAMOND 0.8447 0.2887 0.8131 0.8763 All
dbCAN_2:Hotpep 0.7287 0.3526 0.6910 0.7664 All
dbCAN_3 0.9050 0.2360 0.8791 0.9308 All
dbCAN_3:HMMER 0.8512 0.3052 0.8181 0.8844 All
dbCAN_3:DIAMOND 0.9298 0.2086 0.9069 0.9527 All
dbCAN_3:eCAMI 0.7013 0.3611 0.6624 0.7401 All
dbCAN_4 0.9001 0.2615 0.8716 0.9286 All
dbCAN_4:HMMER 0.8576 0.2976 0.8253 0.8898 All
dbCAN_4:DIAMOND 0.9379 0.1941 0.9166 0.9591 All
dbCAN_4:dbCAN-sub 0.8905 0.2722 0.8609 0.9201 All
CUPP 0.6422 0.4389 0.5940 0.6904 All
dbCAN_2 0.8412 0.3072 0.8025 0.8800 Bacteria
dbCAN_2:HMMER 0.7934 0.3613 0.7487 0.8380 Bacteria
dbCAN_2:DIAMOND 0.8436 0.3080 0.8048 0.8825 Bacteria
dbCAN_2:Hotpep 0.7396 0.3717 0.6943 0.7849 Bacteria
dbCAN_3 0.8911 0.2641 0.8578 0.9245 Bacteria
dbCAN_3:HMMER 0.8282 0.3304 0.7872 0.8692 Bacteria
dbCAN_3:DIAMOND 0.9127 0.2410 0.8823 0.9432 Bacteria
dbCAN_3:eCAMI 0.7243 0.3738 0.6782 0.7704 Bacteria
dbCAN_4 0.8866 0.2805 0.8515 0.9218 Bacteria
dbCAN_4:HMMER 0.8367 0.3218 0.7969 0.8766 Bacteria
dbCAN_4:DIAMOND 0.9170 0.2264 0.8884 0.9456 Bacteria
dbCAN_4:dbCAN-sub 0.8855 0.2805 0.8503 0.9206 Bacteria
CUPP 0.6302 0.4547 0.5728 0.6877 Bacteria
dbCAN_2 0.8287 0.2968 0.7879 0.8694 Eukaryote
dbCAN_2:HMMER 0.8174 0.3407 0.7707 0.8641 Eukaryote
dbCAN_2:DIAMOND 0.8365 0.2824 0.7977 0.8754 Eukaryote
dbCAN_2:Hotpep 0.6620 0.3852 0.6116 0.7124 Eukaryote
dbCAN_3 0.9101 0.2198 0.8798 0.9405 Eukaryote
dbCAN_3:HMMER 0.8496 0.3118 0.8067 0.8924 Eukaryote
dbCAN_3:DIAMOND 0.9563 0.1458 0.9361 0.9765 Eukaryote
dbCAN_3:eCAMI 0.6612 0.3746 0.6108 0.7116 Eukaryote
dbCAN_4 0.9186 0.2378 0.8859 0.9512 Eukaryote
dbCAN_4:HMMER 0.8581 0.3004 0.8168 0.8994 Eukaryote
dbCAN_4:DIAMOND 0.9726 0.1110 0.9572 0.9879 Eukaryote
dbCAN_4:dbCAN-sub 0.9060 0.2565 0.8708 0.9411 Eukaryote
CUPP 0.7038 0.4122 0.6467 0.7610 Eukaryote

11.1.3 Precision

Table 11.3: Overall performance (Precision) of CAZy family classification by CAZyme classifiers per taxonomic group
Prediction_tool Mean Standard_Deviation Lower_CI Upper_CI Tax_group
dbCAN_2 0.8760 0.2971 0.8435 0.9085 All
dbCAN_2:HMMER 0.8292 0.3404 0.7924 0.8661 All
dbCAN_2:DIAMOND 0.8736 0.2891 0.8420 0.9053 All
dbCAN_2:Hotpep 0.7232 0.3959 0.6808 0.7655 All
dbCAN_3 0.9141 0.2404 0.8877 0.9405 All
dbCAN_3:HMMER 0.8629 0.3017 0.8301 0.8957 All
dbCAN_3:DIAMOND 0.9133 0.2227 0.8889 0.9378 All
dbCAN_3:eCAMI 0.7378 0.3890 0.6959 0.7796 All
dbCAN_4 0.8906 0.2705 0.8612 0.9201 All
dbCAN_4:HMMER 0.8695 0.2934 0.8377 0.9013 All
dbCAN_4:DIAMOND 0.9231 0.2059 0.9005 0.9457 All
dbCAN_4:dbCAN-sub 0.8811 0.2802 0.8506 0.9116 All
CUPP 0.6793 0.4519 0.6297 0.7290 All
dbCAN_2 0.8777 0.3103 0.8386 0.9169 Bacteria
dbCAN_2:HMMER 0.8192 0.3643 0.7742 0.8642 Bacteria
dbCAN_2:DIAMOND 0.8667 0.3116 0.8274 0.9060 Bacteria
dbCAN_2:Hotpep 0.7010 0.4110 0.6509 0.7511 Bacteria
dbCAN_3 0.9099 0.2616 0.8768 0.9429 Bacteria
dbCAN_3:HMMER 0.8518 0.3316 0.8107 0.8930 Bacteria
dbCAN_3:DIAMOND 0.9107 0.2448 0.8797 0.9416 Bacteria
dbCAN_3:eCAMI 0.7289 0.3992 0.6797 0.7782 Bacteria
dbCAN_4 0.8872 0.2897 0.8509 0.9235 Bacteria
dbCAN_4:HMMER 0.8606 0.3219 0.8208 0.9005 Bacteria
dbCAN_4:DIAMOND 0.9202 0.2297 0.8911 0.9492 Bacteria
dbCAN_4:dbCAN-sub 0.8817 0.2912 0.8452 0.9182 Bacteria
CUPP 0.6595 0.4672 0.6005 0.7186 Bacteria
dbCAN_2 0.8752 0.2912 0.8352 0.9152 Eukaryote
dbCAN_2:HMMER 0.8385 0.3293 0.7934 0.8836 Eukaryote
dbCAN_2:DIAMOND 0.8895 0.2670 0.8528 0.9263 Eukaryote
dbCAN_2:Hotpep 0.7096 0.4153 0.6553 0.7639 Eukaryote
dbCAN_3 0.9251 0.2104 0.8960 0.9541 Eukaryote
dbCAN_3:HMMER 0.8658 0.2966 0.8250 0.9065 Eukaryote
dbCAN_3:DIAMOND 0.9296 0.1762 0.9052 0.9540 Eukaryote
dbCAN_3:eCAMI 0.7525 0.3938 0.6996 0.8055 Eukaryote
dbCAN_4 0.9083 0.2388 0.8755 0.9411 Eukaryote
dbCAN_4:HMMER 0.8747 0.2840 0.8356 0.9137 Eukaryote
dbCAN_4:DIAMOND 0.9403 0.1526 0.9192 0.9615 Eukaryote
dbCAN_4:dbCAN-sub 0.8960 0.2557 0.8610 0.9310 Eukaryote
CUPP 0.7276 0.4212 0.6692 0.7861 Eukaryote

11.1.4 F1-score

Table 11.4: Overall performance (F1-score) of CAZy family classification by CAZyme classifiers per taxonomic group
Prediction_tool Mean Standard_Deviation Lower_CI Upper_CI Tax_group
dbCAN_2 0.8458 0.2921 0.8139 0.8778 All
dbCAN_2:HMMER 0.8080 0.3374 0.7714 0.8445 All
dbCAN_2:DIAMOND 0.8497 0.2845 0.8185 0.8808 All
dbCAN_2:Hotpep 0.6957 0.3664 0.6565 0.7349 All
dbCAN_3 0.9014 0.2350 0.8757 0.9272 All
dbCAN_3:HMMER 0.8435 0.3009 0.8108 0.8761 All
dbCAN_3:DIAMOND 0.9157 0.2121 0.8924 0.9389 All
dbCAN_3:eCAMI 0.6976 0.3642 0.6584 0.7368 All
dbCAN_4 0.8856 0.2636 0.8569 0.9143 All
dbCAN_4:HMMER 0.8501 0.2933 0.8183 0.8819 All
dbCAN_4:DIAMOND 0.9229 0.1985 0.9012 0.9447 All
dbCAN_4:dbCAN-sub 0.8750 0.2736 0.8452 0.9047 All
CUPP 0.6511 0.4385 0.6029 0.6992 All
dbCAN_2 0.8497 0.3051 0.8112 0.8881 Bacteria
dbCAN_2:HMMER 0.7961 0.3577 0.7519 0.8403 Bacteria
dbCAN_2:DIAMOND 0.8462 0.3042 0.8079 0.8846 Bacteria
dbCAN_2:Hotpep 0.6919 0.3875 0.6446 0.7391 Bacteria
dbCAN_3 0.8917 0.2604 0.8588 0.9246 Bacteria
dbCAN_3:HMMER 0.8294 0.3260 0.7889 0.8698 Bacteria
dbCAN_3:DIAMOND 0.9046 0.2385 0.8745 0.9347 Bacteria
dbCAN_3:eCAMI 0.7081 0.3786 0.6614 0.7548 Bacteria
dbCAN_4 0.8780 0.2819 0.8427 0.9133 Bacteria
dbCAN_4:HMMER 0.8382 0.3172 0.7989 0.8775 Bacteria
dbCAN_4:DIAMOND 0.9119 0.2237 0.8836 0.9402 Bacteria
dbCAN_4:dbCAN-sub 0.8740 0.2826 0.8386 0.9094 Bacteria
CUPP 0.6393 0.4557 0.5817 0.6969 Bacteria
dbCAN_2 0.8388 0.2902 0.7989 0.8786 Eukaryote
dbCAN_2:HMMER 0.8115 0.3339 0.7657 0.8572 Eukaryote
dbCAN_2:DIAMOND 0.8487 0.2704 0.8114 0.8859 Eukaryote
dbCAN_2:Hotpep 0.6610 0.3859 0.6106 0.7115 Eukaryote
dbCAN_3 0.9073 0.2118 0.8780 0.9365 Eukaryote
dbCAN_3:HMMER 0.8408 0.3041 0.7990 0.8825 Eukaryote
dbCAN_3:DIAMOND 0.9360 0.1579 0.9142 0.9579 Eukaryote
dbCAN_3:eCAMI 0.6848 0.3705 0.6349 0.7346 Eukaryote
dbCAN_4 0.9040 0.2366 0.8715 0.9365 Eukaryote
dbCAN_4:HMMER 0.8495 0.2926 0.8093 0.8897 Eukaryote
dbCAN_4:DIAMOND 0.9483 0.1321 0.9300 0.9666 Eukaryote
dbCAN_4:dbCAN-sub 0.8914 0.2540 0.8566 0.9262 Eukaryote
CUPP 0.7066 0.4104 0.6496 0.7635 Eukaryote

11.1.5 Accuracy

Table 11.5: Overall performance (Accuracy) of CAZy family classification by CAZyme classifiers per taxonomic group
Prediction_tool Mean Standard_Deviation Lower_CI Upper_CI Tax_group
dbCAN_2 0.9995 0.0014 0.9993 0.9996 All
dbCAN_2:HMMER 0.9993 0.0019 0.9991 0.9996 All
dbCAN_2:DIAMOND 0.9996 0.0010 0.9994 0.9997 All
dbCAN_2:Hotpep 0.9987 0.0032 0.9984 0.9991 All
dbCAN_3 0.9996 0.0010 0.9995 0.9998 All
dbCAN_3:HMMER 0.9994 0.0019 0.9992 0.9996 All
dbCAN_3:DIAMOND 0.9997 0.0007 0.9996 0.9998 All
dbCAN_3:eCAMI 0.9992 0.0017 0.9990 0.9993 All
dbCAN_4 0.9996 0.0012 0.9995 0.9998 All
dbCAN_4:HMMER 0.9994 0.0019 0.9992 0.9996 All
dbCAN_4:DIAMOND 0.9997 0.0007 0.9997 0.9998 All
dbCAN_4:dbCAN-sub 0.9996 0.0012 0.9995 0.9998 All
CUPP 0.9992 0.0021 0.9990 0.9994 All
dbCAN_2 0.9993 0.0020 0.9991 0.9996 Bacteria
dbCAN_2:HMMER 0.9992 0.0027 0.9988 0.9995 Bacteria
dbCAN_2:DIAMOND 0.9994 0.0016 0.9992 0.9996 Bacteria
dbCAN_2:Hotpep 0.9981 0.0051 0.9975 0.9987 Bacteria
dbCAN_3 0.9995 0.0015 0.9993 0.9997 Bacteria
dbCAN_3:HMMER 0.9992 0.0027 0.9989 0.9995 Bacteria
dbCAN_3:DIAMOND 0.9996 0.0012 0.9994 0.9997 Bacteria
dbCAN_3:eCAMI 0.9988 0.0027 0.9985 0.9991 Bacteria
dbCAN_4 0.9996 0.0013 0.9994 0.9997 Bacteria
dbCAN_4:HMMER 0.9992 0.0026 0.9989 0.9996 Bacteria
dbCAN_4:DIAMOND 0.9996 0.0012 0.9994 0.9997 Bacteria
dbCAN_4:dbCAN-sub 0.9996 0.0013 0.9994 0.9997 Bacteria
CUPP 0.9989 0.0030 0.9985 0.9993 Bacteria
dbCAN_2 0.9991 0.0019 0.9989 0.9994 Eukaryote
dbCAN_2:HMMER 0.9990 0.0027 0.9986 0.9993 Eukaryote
dbCAN_2:DIAMOND 0.9993 0.0012 0.9992 0.9995 Eukaryote
dbCAN_2:Hotpep 0.9984 0.0032 0.9979 0.9988 Eukaryote
dbCAN_3 0.9995 0.0012 0.9993 0.9996 Eukaryote
dbCAN_3:HMMER 0.9990 0.0026 0.9987 0.9994 Eukaryote
dbCAN_3:DIAMOND 0.9996 0.0007 0.9995 0.9997 Eukaryote
dbCAN_3:eCAMI 0.9988 0.0020 0.9985 0.9990 Eukaryote
dbCAN_4 0.9994 0.0019 0.9991 0.9997 Eukaryote
dbCAN_4:HMMER 0.9990 0.0026 0.9987 0.9994 Eukaryote
dbCAN_4:DIAMOND 0.9997 0.0007 0.9996 0.9998 Eukaryote
dbCAN_4:dbCAN-sub 0.9994 0.0020 0.9991 0.9996 Eukaryote
CUPP 0.9988 0.0028 0.9984 0.9991 Eukaryote
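
The five statistics tabulated in this section are standard confusion-matrix quantities. The sketch below is a minimal illustration, assuming each CAZy family is scored one-vs-rest (a protein either is or is not annotated with that family). The function name and the example counts are hypothetical and are not taken from the evaluation's own code. It shows why Specificity and Accuracy sit close to 1 for nearly every tool while Sensitivity, Precision and F1-score vary: for any single family, true negatives vastly outnumber every other outcome.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Confusion-matrix statistics for one family scored one-vs-rest.

    A ratio with a zero denominator is returned as NaN rather than raising,
    e.g. Precision when a tool never predicts the family at all.
    """
    def ratio(num, den):
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),          # recall of true members
        "specificity": ratio(tn, tn + fp),          # recall of non-members
        "precision": ratio(tp, tp + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical family: 10 true members among 10,000 test proteins.
# The tool finds 8 of them and raises 1 false alarm.
scores = binary_metrics(tp=8, fp=1, tn=9989, fn=2)
for name, value in scores.items():
    print(f"{name:12s} {value:.4f}")
```

With these illustrative counts, missing a fifth of the family members lowers Sensitivity to 0.8 but leaves Accuracy and Specificity above 0.999, so the Sensitivity, Precision and F1-score tables are the more discriminating ones when comparing tools.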

11.2 Per CAZy class

11.2.1 Specificity

Table 11.6: Overall performance (Specificity) of CAZy family classification by CAZyme classifiers per taxonomic group and CAZy class
Prediction_tool Mean Standard_Deviation Lower_CI Upper_CI Tax_group CAZy_class
dbCAN_2 0.9999 0.0004 0.9999 1.0000 All GH
dbCAN_2:HMMER 1.0000 0.0002 0.9999 1.0000 All GH
dbCAN_2:DIAMOND 0.9999 0.0005 0.9998 1.0000 All GH
dbCAN_2:Hotpep 0.9997 0.0011 0.9995 0.9999 All GH
dbCAN_3 0.9999 0.0003 0.9999 1.0000 All GH
dbCAN_3:HMMER 1.0000 0.0002 0.9999 1.0000 All GH
dbCAN_3:DIAMOND 0.9999 0.0005 0.9998 1.0000 All GH
dbCAN_3:eCAMI 0.9998 0.0009 0.9996 0.9999 All GH
dbCAN_4 1.0000 0.0002 0.9999 1.0000 All GH
dbCAN_4:HMMER 1.0000 0.0002 0.9999 1.0000 All GH
dbCAN_4:DIAMOND 0.9999 0.0004 0.9998 1.0000 All GH
dbCAN_4:dbCAN-sub 0.9999 0.0004 0.9999 1.0000 All GH
CUPP 0.9999 0.0004 0.9999 1.0000 All GH
dbCAN_2 0.9999 0.0002 0.9999 1.0000 Bacteria GH
dbCAN_2:HMMER 0.9999 0.0003 0.9998 0.9999 Bacteria GH
dbCAN_2:DIAMOND 0.9999 0.0002 0.9999 1.0000 Bacteria GH
dbCAN_2:Hotpep 0.9999 0.0002 0.9999 1.0000 Bacteria GH
dbCAN_3 0.9999 0.0002 0.9999 1.0000 Bacteria GH
dbCAN_3:HMMER 0.9999 0.0003 0.9998 1.0000 Bacteria GH
dbCAN_3:DIAMOND 0.9999 0.0003 0.9998 0.9999 Bacteria GH
dbCAN_3:eCAMI 0.9999 0.0003 0.9999 1.0000 Bacteria GH
dbCAN_4 0.9999 0.0002 0.9999 1.0000 Bacteria GH
dbCAN_4:HMMER 0.9999 0.0003 0.9998 1.0000 Bacteria GH
dbCAN_4:DIAMOND 0.9999 0.0003 0.9998 0.9999 Bacteria GH
dbCAN_4:dbCAN-sub 0.9999 0.0002 0.9999 1.0000 Bacteria GH
CUPP 0.9999 0.0002 0.9999 1.0000 Bacteria GH
dbCAN_2 0.9999 0.0002 0.9999 1.0000 Eukaryote GH
dbCAN_2:HMMER 0.9999 0.0002 0.9999 1.0000 Eukaryote GH
dbCAN_2:DIAMOND 0.9999 0.0003 0.9999 1.0000 Eukaryote GH
dbCAN_2:Hotpep 0.9998 0.0006 0.9998 0.9999 Eukaryote GH
dbCAN_3 0.9999 0.0002 0.9999 1.0000 Eukaryote GH
dbCAN_3:HMMER 0.9999 0.0001 0.9999 1.0000 Eukaryote GH
dbCAN_3:DIAMOND 0.9999 0.0003 0.9999 1.0000 Eukaryote GH
dbCAN_3:eCAMI 0.9999 0.0004 0.9998 1.0000 Eukaryote GH
dbCAN_4 1.0000 0.0001 0.9999 1.0000 Eukaryote GH
dbCAN_4:HMMER 0.9999 0.0001 0.9999 1.0000 Eukaryote GH
dbCAN_4:DIAMOND 0.9999 0.0002 0.9999 1.0000 Eukaryote GH
dbCAN_4:dbCAN-sub 0.9999 0.0002 0.9999 1.0000 Eukaryote GH
CUPP 0.9999 0.0002 0.9999 1.0000 Eukaryote GH
dbCAN_2 0.9999 0.0003 0.9998 1.0000 All GT
dbCAN_2:HMMER 0.9999 0.0003 0.9998 1.0000 All GT
dbCAN_2:DIAMOND 0.9999 0.0005 0.9997 1.0000 All GT
dbCAN_2:Hotpep 0.9999 0.0004 0.9997 1.0000 All GT
dbCAN_3 0.9999 0.0003 0.9998 1.0000 All GT
dbCAN_3:HMMER 0.9999 0.0002 0.9998 1.0000 All GT
dbCAN_3:DIAMOND 0.9999 0.0005 0.9997 1.0000 All GT
dbCAN_3:eCAMI 0.9999 0.0005 0.9997 1.0000 All GT
dbCAN_4 0.9999 0.0004 0.9997 1.0000 All GT
dbCAN_4:HMMER 0.9999 0.0002 0.9998 1.0000 All GT
dbCAN_4:DIAMOND 0.9999 0.0005 0.9997 1.0000 All GT
dbCAN_4:dbCAN-sub 0.9998 0.0006 0.9996 1.0000 All GT
CUPP 0.9999 0.0003 0.9998 1.0000 All GT
dbCAN_2 0.9998 0.0004 0.9997 0.9999 Bacteria GT
dbCAN_2:HMMER 0.9997 0.0006 0.9995 0.9999 Bacteria GT
dbCAN_2:DIAMOND 0.9998 0.0005 0.9997 0.9999 Bacteria GT
dbCAN_2:Hotpep 0.9998 0.0003 0.9997 0.9999 Bacteria GT
dbCAN_3 0.9998 0.0005 0.9997 0.9999 Bacteria GT
dbCAN_3:HMMER 0.9997 0.0006 0.9995 0.9999 Bacteria GT
dbCAN_3:DIAMOND 0.9997 0.0006 0.9996 0.9999 Bacteria GT
dbCAN_3:eCAMI 0.9998 0.0004 0.9997 0.9999 Bacteria GT
dbCAN_4 0.9997 0.0005 0.9996 0.9999 Bacteria GT
dbCAN_4:HMMER 0.9997 0.0006 0.9995 0.9999 Bacteria GT
dbCAN_4:DIAMOND 0.9997 0.0006 0.9996 0.9999 Bacteria GT
dbCAN_4:dbCAN-sub 0.9997 0.0005 0.9996 0.9999 Bacteria GT
CUPP 0.9998 0.0004 0.9997 0.9999 Bacteria GT
dbCAN_2 0.9999 0.0002 0.9999 1.0000 Eukaryote GT
dbCAN_2:HMMER 0.9999 0.0003 0.9998 0.9999 Eukaryote GT
dbCAN_2:DIAMOND 0.9999 0.0003 0.9998 1.0000 Eukaryote GT
dbCAN_2:Hotpep 0.9999 0.0002 0.9998 1.0000 Eukaryote GT
dbCAN_3 0.9999 0.0002 0.9998 1.0000 Eukaryote GT
dbCAN_3:HMMER 0.9999 0.0003 0.9998 0.9999 Eukaryote GT
dbCAN_3:DIAMOND 0.9999 0.0003 0.9998 0.9999 Eukaryote GT
dbCAN_3:eCAMI 0.9999 0.0002 0.9998 1.0000 Eukaryote GT
dbCAN_4 0.9999 0.0003 0.9998 0.9999 Eukaryote GT
dbCAN_4:HMMER 0.9999 0.0003 0.9998 0.9999 Eukaryote GT
dbCAN_4:DIAMOND 0.9999 0.0003 0.9998 0.9999 Eukaryote GT
dbCAN_4:dbCAN-sub 0.9998 0.0003 0.9998 0.9999 Eukaryote GT
CUPP 0.9999 0.0002 0.9999 1.0000 Eukaryote GT
dbCAN_2 1.0000 0.0000 1.0000 1.0000 All PL
dbCAN_2:HMMER 1.0000 0.0001 1.0000 1.0000 All PL
dbCAN_2:DIAMOND 1.0000 0.0001 1.0000 1.0000 All PL
dbCAN_2:Hotpep 1.0000 0.0001 0.9999 1.0000 All PL
dbCAN_3 1.0000 0.0001 0.9999 1.0000 All PL
dbCAN_3:HMMER 1.0000 0.0001 0.9999 1.0000 All PL
dbCAN_3:DIAMOND 1.0000 0.0001 0.9999 1.0000 All PL
dbCAN_3:eCAMI 1.0000 0.0002 0.9999 1.0000 All PL
dbCAN_4 1.0000 0.0001 0.9999 1.0000 All PL
dbCAN_4:HMMER 1.0000 0.0001 0.9999 1.0000 All PL
dbCAN_4:DIAMOND 1.0000 0.0001 0.9999 1.0000 All PL
dbCAN_4:dbCAN-sub 1.0000 0.0001 0.9999 1.0000 All PL
CUPP 1.0000 0.0000 1.0000 1.0000 All PL
dbCAN_2 1.0000 0.0001 0.9999 1.0000 Bacteria PL
dbCAN_2:HMMER 1.0000 0.0001 0.9999 1.0000 Bacteria PL
dbCAN_2:DIAMOND 1.0000 0.0001 0.9999 1.0000 Bacteria PL
dbCAN_2:Hotpep 1.0000 0.0001 0.9999 1.0000 Bacteria PL
dbCAN_3 0.9999 0.0001 0.9998 1.0000 Bacteria PL
dbCAN_3:HMMER 0.9999 0.0001 0.9998 1.0000 Bacteria PL
dbCAN_3:DIAMOND 0.9999 0.0002 0.9997 1.0000 Bacteria PL
dbCAN_3:eCAMI 1.0000 0.0001 0.9999 1.0000 Bacteria PL
dbCAN_4 0.9999 0.0001 0.9998 1.0000 Bacteria PL
dbCAN_4:HMMER 0.9999 0.0001 0.9998 1.0000 Bacteria PL
dbCAN_4:DIAMOND 0.9999 0.0002 0.9997 1.0000 Bacteria PL
dbCAN_4:dbCAN-sub 0.9999 0.0001 0.9998 1.0000 Bacteria PL
CUPP 0.9999 0.0001 0.9998 1.0000 Bacteria PL
dbCAN_2 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_2:HMMER 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_2:DIAMOND 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_2:Hotpep 1.0000 0.0001 1.0000 1.0000 Eukaryote PL
dbCAN_3 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_3:HMMER 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_3:DIAMOND 1.0000 0.0001 0.9999 1.0000 Eukaryote PL
dbCAN_3:eCAMI 1.0000 0.0001 0.9999 1.0000 Eukaryote PL
dbCAN_4 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_4:HMMER 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_4:DIAMOND 1.0000 0.0001 0.9999 1.0000 Eukaryote PL
dbCAN_4:dbCAN-sub 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
CUPP 1.0000 0.0000 1.0000 1.0000 Eukaryote PL
dbCAN_2 0.9990 0.0021 0.9979 1.0002 All CE
dbCAN_2:HMMER 0.9992 0.0018 0.9982 1.0002 All CE
dbCAN_2:DIAMOND 0.9991 0.0020 0.9979 1.0002 All CE
dbCAN_2:Hotpep 0.9987 0.0022 0.9974 0.9999 All CE
dbCAN_3 0.9990 0.0021 0.9978 1.0002 All CE
dbCAN_3:HMMER 0.9992 0.0019 0.9982 1.0003 All CE
dbCAN_3:DIAMOND 0.9992 0.0017 0.9983 1.0001 All CE
dbCAN_3:eCAMI 0.9989 0.0021 0.9977 1.0000 All CE
dbCAN_4 0.9993 0.0019 0.9982 1.0003 All CE
dbCAN_4:HMMER 0.9992 0.0019 0.9982 1.0003 All CE
dbCAN_4:DIAMOND 0.9993 0.0017 0.9984 1.0002 All CE
dbCAN_4:dbCAN-sub 0.9993 0.0019 0.9982 1.0003 All CE
CUPP 0.9993 0.0019 0.9982 1.0003 All CE
dbCAN_2 0.9999 0.0002 0.9997 1.0000 Bacteria CE
dbCAN_2:HMMER 0.9997 0.0004 0.9995 1.0000 Bacteria CE
dbCAN_2:DIAMOND 0.9999 0.0002 0.9998 1.0000 Bacteria CE
dbCAN_2:Hotpep 0.9999 0.0002 0.9998 1.0000 Bacteria CE
dbCAN_3 0.9999 0.0002 0.9997 1.0000 Bacteria CE
dbCAN_3:HMMER 0.9997 0.0004 0.9995 1.0000 Bacteria CE
dbCAN_3:DIAMOND 0.9996 0.0008 0.9991 1.0001 Bacteria CE
dbCAN_3:eCAMI 0.9999 0.0002 0.9998 1.0000 Bacteria CE
dbCAN_4 0.9998 0.0004 0.9996 1.0000 Bacteria CE
dbCAN_4:HMMER 0.9997 0.0004 0.9995 1.0000 Bacteria CE
dbCAN_4:DIAMOND 0.9996 0.0008 0.9991 1.0001 Bacteria CE
dbCAN_4:dbCAN-sub 0.9998 0.0004 0.9995 1.0000 Bacteria CE
CUPP 0.9998 0.0004 0.9996 1.0001 Bacteria CE
dbCAN_2 0.9995 0.0010 0.9990 1.0000 Eukaryote CE
dbCAN_2:HMMER 0.9996 0.0009 0.9991 1.0000 Eukaryote CE
dbCAN_2:DIAMOND 0.9996 0.0010 0.9991 1.0001 Eukaryote CE
dbCAN_2:Hotpep 0.9994 0.0011 0.9988 0.9999 Eukaryote CE
dbCAN_3 0.9995 0.0010 0.9990 1.0001 Eukaryote CE
dbCAN_3:HMMER 0.9996 0.0009 0.9991 1.0000 Eukaryote CE
dbCAN_3:DIAMOND 0.9995 0.0008 0.9991 0.9999 Eukaryote CE
dbCAN_3:eCAMI 0.9995 0.0010 0.9990 1.0000 Eukaryote CE
dbCAN_4 0.9996 0.0009 0.9991 1.0000 Eukaryote CE
dbCAN_4:HMMER 0.9996 0.0009 0.9991 1.0000 Eukaryote CE
dbCAN_4:DIAMOND 0.9996 0.0009 0.9991 1.0000 Eukaryote CE
dbCAN_4:dbCAN-sub 0.9996 0.0009 0.9991 1.0000 Eukaryote CE
CUPP 0.9996 0.0009 0.9991 1.0001 Eukaryote CE
dbCAN_2 1.0000 NA NaN NaN All AA
dbCAN_2:HMMER 1.0000 NA NaN NaN All AA
dbCAN_2:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_2:Hotpep 1.0000 NA NaN NaN All AA
dbCAN_3 1.0000 NA NaN NaN All AA
dbCAN_3:HMMER 1.0000 NA NaN NaN All AA
dbCAN_3:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_3:eCAMI 1.0000 NA NaN NaN All AA
dbCAN_4 1.0000 NA NaN NaN All AA
dbCAN_4:HMMER 1.0000 NA NaN NaN All AA
dbCAN_4:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_4:dbCAN-sub 1.0000 NA NaN NaN All AA
CUPP 1.0000 NA NaN NaN All AA
dbCAN_2 0.9994 0.0011 0.9988 1.0000 Bacteria AA
dbCAN_2:HMMER 0.9993 0.0012 0.9986 0.9999 Bacteria AA
dbCAN_2:DIAMOND 0.9994 0.0013 0.9987 1.0001 Bacteria AA
dbCAN_2:Hotpep 0.9993 0.0013 0.9986 1.0000 Bacteria AA
dbCAN_3 0.9993 0.0013 0.9986 1.0000 Bacteria AA
dbCAN_3:HMMER 0.9992 0.0013 0.9985 0.9999 Bacteria AA
dbCAN_3:DIAMOND 0.9991 0.0013 0.9984 0.9998 Bacteria AA
dbCAN_3:eCAMI 0.9994 0.0013 0.9987 1.0001 Bacteria AA
dbCAN_4 0.9992 0.0013 0.9985 1.0000 Bacteria AA
dbCAN_4:HMMER 0.9992 0.0013 0.9985 0.9999 Bacteria AA
dbCAN_4:DIAMOND 0.9992 0.0013 0.9985 0.9999 Bacteria AA
dbCAN_4:dbCAN-sub 0.9992 0.0014 0.9984 0.9999 Bacteria AA
CUPP 0.9993 0.0014 0.9985 1.0001 Bacteria AA
dbCAN_2 0.9997 0.0006 0.9994 1.0000 Eukaryote AA
dbCAN_2:HMMER 0.9997 0.0006 0.9994 1.0000 Eukaryote AA
dbCAN_2:DIAMOND 0.9997 0.0006 0.9994 1.0000 Eukaryote AA
dbCAN_2:Hotpep 0.9997 0.0006 0.9993 1.0000 Eukaryote AA
dbCAN_3 0.9997 0.0006 0.9993 1.0000 Eukaryote AA
dbCAN_3:HMMER 0.9996 0.0007 0.9993 1.0000 Eukaryote AA
dbCAN_3:DIAMOND 0.9996 0.0006 0.9993 0.9999 Eukaryote AA
dbCAN_3:eCAMI 0.9997 0.0006 0.9994 1.0000 Eukaryote AA
dbCAN_4 0.9996 0.0007 0.9993 1.0000 Eukaryote AA
dbCAN_4:HMMER 0.9996 0.0007 0.9993 1.0000 Eukaryote AA
dbCAN_4:DIAMOND 0.9996 0.0006 0.9993 0.9999 Eukaryote AA
dbCAN_4:dbCAN-sub 0.9996 0.0007 0.9993 1.0000 Eukaryote AA
CUPP 0.9997 0.0007 0.9993 1.0000 Eukaryote AA
dbCAN_2 0.9999 0.0004 0.9997 1.0000 All CBM
dbCAN_2:HMMER 0.9998 0.0005 0.9997 1.0000 All CBM
dbCAN_2:DIAMOND 0.9999 0.0003 0.9998 0.9999 All CBM
dbCAN_2:Hotpep 0.9964 0.0055 0.9950 0.9978 All CBM
dbCAN_3 0.9999 0.0002 0.9998 1.0000 All CBM
dbCAN_3:HMMER 0.9998 0.0005 0.9997 1.0000 All CBM
dbCAN_3:DIAMOND 0.9999 0.0002 0.9998 0.9999 All CBM
dbCAN_3:eCAMI 0.9983 0.0031 0.9975 0.9991 All CBM
dbCAN_4 0.9998 0.0005 0.9997 1.0000 All CBM
dbCAN_4:HMMER 0.9998 0.0005 0.9997 1.0000 All CBM
dbCAN_4:DIAMOND 0.9999 0.0002 0.9998 1.0000 All CBM
dbCAN_4:dbCAN-sub 0.9998 0.0005 0.9996 1.0000 All CBM
CUPP 1.0000 0.0000 1.0000 1.0000 All CBM
dbCAN_2 0.9996 0.0006 0.9994 0.9999 Bacteria CBM
dbCAN_2:HMMER 0.9999 0.0005 0.9997 1.0001 Bacteria CBM
dbCAN_2:DIAMOND 0.9995 0.0008 0.9991 0.9998 Bacteria CBM
dbCAN_2:Hotpep 0.9975 0.0045 0.9961 0.9988 Bacteria CBM
dbCAN_3 0.9998 0.0003 0.9996 0.9999 Bacteria CBM
dbCAN_3:HMMER 0.9999 0.0005 0.9997 1.0001 Bacteria CBM
dbCAN_3:DIAMOND 0.9997 0.0005 0.9995 0.9999 Bacteria CBM
dbCAN_3:eCAMI 0.9988 0.0025 0.9980 0.9997 Bacteria CBM
dbCAN_4 0.9998 0.0005 0.9996 1.0000 Bacteria CBM
dbCAN_4:HMMER 0.9999 0.0005 0.9997 1.0001 Bacteria CBM
dbCAN_4:DIAMOND 0.9997 0.0004 0.9996 0.9999 Bacteria CBM
dbCAN_4:dbCAN-sub 0.9997 0.0009 0.9993 1.0001 Bacteria CBM
CUPP 1.0000 0.0000 1.0000 1.0000 Bacteria CBM
dbCAN_2 0.9999 0.0003 0.9998 0.9999 Eukaryote CBM
dbCAN_2:HMMER 0.9999 0.0004 0.9998 1.0000 Eukaryote CBM
dbCAN_2:DIAMOND 0.9998 0.0003 0.9997 0.9999 Eukaryote CBM
dbCAN_2:Hotpep 0.9976 0.0039 0.9966 0.9985 Eukaryote CBM
dbCAN_3 0.9999 0.0002 0.9999 1.0000 Eukaryote CBM
dbCAN_3:HMMER 0.9999 0.0004 0.9998 1.0000 Eukaryote CBM
dbCAN_3:DIAMOND 0.9999 0.0002 0.9998 0.9999 Eukaryote CBM
dbCAN_3:eCAMI 0.9989 0.0018 0.9985 0.9994 Eukaryote CBM
dbCAN_4 0.9999 0.0004 0.9998 1.0000 Eukaryote CBM
dbCAN_4:HMMER 0.9999 0.0004 0.9998 1.0000 Eukaryote CBM
dbCAN_4:DIAMOND 0.9999 0.0002 0.9998 0.9999 Eukaryote CBM
dbCAN_4:dbCAN-sub 0.9998 0.0005 0.9997 1.0000 Eukaryote CBM
CUPP 1.0000 0.0000 1.0000 1.0000 Eukaryote CBM

11.2.2 Sensitivity

Table 11.7: Overall performance (Sensitivity) of CAZy family classification by CAZyme classifiers per taxonomic group and CAZy class
Prediction_tool Mean Standard_Deviation Lower_CI Upper_CI Tax_group CAZy_class
dbCAN_2 0.8604 0.2969 0.8072 0.9136 All GH
dbCAN_2:HMMER 0.8358 0.3324 0.7772 0.8944 All GH
dbCAN_2:DIAMOND 0.8539 0.3047 0.7993 0.9085 All GH
dbCAN_2:Hotpep 0.8035 0.3411 0.7426 0.8643 All GH
dbCAN_3 0.9128 0.2449 0.8687 0.9569 All GH
dbCAN_3:HMMER 0.8679 0.2977 0.8152 0.9206 All GH
dbCAN_3:DIAMOND 0.9271 0.2264 0.8864 0.9679 All GH
dbCAN_3:eCAMI 0.7584 0.3660 0.6928 0.8240 All GH
dbCAN_4 0.9037 0.2638 0.8566 0.9507 All GH
dbCAN_4:HMMER 0.8759 0.2875 0.8250 0.9268 All GH
dbCAN_4:DIAMOND 0.9336 0.2096 0.8959 0.9714 All GH
dbCAN_4:dbCAN-sub 0.9001 0.2661 0.8526 0.9476 All GH
CUPP 0.7845 0.3768 0.7167 0.8523 All GH
dbCAN_2 0.8866 0.2244 0.8393 0.9339 Bacteria GH
dbCAN_2:HMMER 0.8967 0.2492 0.8442 0.9492 Bacteria GH
dbCAN_2:DIAMOND 0.8842 0.2382 0.8340 0.9344 Bacteria GH
dbCAN_2:Hotpep 0.7795 0.2974 0.7169 0.8422 Bacteria GH
dbCAN_3 0.9503 0.1252 0.9237 0.9768 Bacteria GH
dbCAN_3:HMMER 0.9137 0.2277 0.8655 0.9620 Bacteria GH
dbCAN_3:DIAMOND 0.9825 0.0680 0.9681 0.9969 Bacteria GH
dbCAN_3:eCAMI 0.7045 0.3419 0.6325 0.7765 Bacteria GH
dbCAN_4 0.9610 0.1591 0.9273 0.9947 Bacteria GH
dbCAN_4:HMMER 0.9137 0.2277 0.8655 0.9620 Bacteria GH
dbCAN_4:DIAMOND 0.9809 0.0713 0.9658 0.9960 Bacteria GH
dbCAN_4:dbCAN-sub 0.9496 0.1881 0.9100 0.9892 Bacteria GH
CUPP 0.8280 0.3351 0.7570 0.8990 Bacteria GH
dbCAN_2 0.8667 0.2703 0.8208 0.9125 Eukaryote GH
dbCAN_2:HMMER 0.8483 0.3135 0.7960 0.9007 Eukaryote GH
dbCAN_2:DIAMOND 0.8680 0.2700 0.8222 0.9138 Eukaryote GH
dbCAN_2:Hotpep 0.7820 0.3204 0.7279 0.8361 Eukaryote GH
dbCAN_3 0.9282 0.2007 0.8940 0.9623 Eukaryote GH
dbCAN_3:HMMER 0.8796 0.2785 0.8329 0.9263 Eukaryote GH
dbCAN_3:DIAMOND 0.9440 0.1931 0.9111 0.9768 Eukaryote GH
dbCAN_3:eCAMI 0.7257 0.3538 0.6659 0.7854 Eukaryote GH
dbCAN_4 0.9197 0.2355 0.8799 0.9595 Eukaryote GH
dbCAN_4:HMMER 0.8868 0.2684 0.8418 0.9318 Eukaryote GH
dbCAN_4:DIAMOND 0.9489 0.1753 0.9191 0.9787 Eukaryote GH
dbCAN_4:dbCAN-sub 0.9096 0.2496 0.8676 0.9516 Eukaryote GH
CUPP 0.7845 0.3643 0.7225 0.8465 Eukaryote GH
dbCAN_2 0.8201 0.2969 0.7225 0.9177 All GT
dbCAN_2:HMMER 0.7106 0.3823 0.5914 0.8297 All GT
dbCAN_2:DIAMOND 0.8280 0.2941 0.7313 0.9247 All GT
dbCAN_2:Hotpep 0.8289 0.2830 0.7359 0.9220 All GT
dbCAN_3 0.8682 0.2355 0.7908 0.9456 All GT
dbCAN_3:HMMER 0.7582 0.3512 0.6488 0.8676 All GT
dbCAN_3:DIAMOND 0.8965 0.2161 0.8255 0.9676 All GT
dbCAN_3:eCAMI 0.8053 0.3150 0.7032 0.9074 All GT
dbCAN_4 0.8530 0.2927 0.7581 0.9478 All GT
dbCAN_4:HMMER 0.7406 0.3657 0.6280 0.8531 All GT
dbCAN_4:DIAMOND 0.8909 0.2160 0.8199 0.9619 All GT
dbCAN_4:dbCAN-sub 0.8523 0.2927 0.7574 0.9472 All GT
CUPP 0.7959 0.3361 0.6854 0.9063 All GT
dbCAN_2 0.8887 0.2231 0.8278 0.9496 Bacteria GT
dbCAN_2:HMMER 0.8848 0.2585 0.8143 0.9554 Bacteria GT
dbCAN_2:DIAMOND 0.9105 0.2020 0.8554 0.9656 Bacteria GT
dbCAN_2:Hotpep 0.7561 0.3199 0.6688 0.8435 Bacteria GT
dbCAN_3 0.9126 0.2203 0.8531 0.9722 Bacteria GT
dbCAN_3:HMMER 0.8778 0.2613 0.8072 0.9485 Bacteria GT
dbCAN_3:DIAMOND 0.9610 0.1533 0.9196 1.0024 Bacteria GT
dbCAN_3:eCAMI 0.7649 0.3030 0.6823 0.8476 Bacteria GT
dbCAN_4 0.9239 0.2370 0.8598 0.9880 Bacteria GT
dbCAN_4:HMMER 0.8776 0.2613 0.8070 0.9483 Bacteria GT
dbCAN_4:DIAMOND 0.9652 0.1518 0.9241 1.0062 Bacteria GT
dbCAN_4:dbCAN-sub 0.9247 0.2372 0.8606 0.9888 Bacteria GT
CUPP 0.8048 0.2757 0.7295 0.8800 Bacteria GT
dbCAN_2 0.8588 0.2654 0.7977 0.9199 Eukaryote GT
dbCAN_2:HMMER 0.8430 0.3035 0.7736 0.9123 Eukaryote GT
dbCAN_2:DIAMOND 0.8717 0.2586 0.8122 0.9312 Eukaryote GT
dbCAN_2:Hotpep 0.7796 0.3189 0.7062 0.8529 Eukaryote GT
dbCAN_3 0.9008 0.2261 0.8491 0.9524 Eukaryote GT
dbCAN_3:HMMER 0.8759 0.2546 0.8177 0.9341 Eukaryote GT
dbCAN_3:DIAMOND 0.9354 0.1866 0.8927 0.9780 Eukaryote GT
dbCAN_3:eCAMI 0.7749 0.3195 0.7019 0.8479 Eukaryote GT
dbCAN_4 0.9103 0.2389 0.8557 0.9649 Eukaryote GT
dbCAN_4:HMMER 0.8644 0.2718 0.8027 0.9261 Eukaryote GT
dbCAN_4:DIAMOND 0.9360 0.1867 0.8933 0.9787 Eukaryote GT
dbCAN_4:dbCAN-sub 0.9105 0.2391 0.8558 0.9651 Eukaryote GT
CUPP 0.8039 0.2981 0.7354 0.8725 Eukaryote GT
dbCAN_2 0.8032 0.3523 0.6470 0.9594 All PL
dbCAN_2:HMMER 0.8286 0.3563 0.6706 0.9865 All PL
dbCAN_2:DIAMOND 0.8513 0.3503 0.6960 1.0066 All PL
dbCAN_2:Hotpep 0.6894 0.3717 0.5286 0.8501 All PL
dbCAN_3 0.9536 0.2130 0.8591 1.0480 All PL
dbCAN_3:HMMER 0.9195 0.2361 0.8148 1.0241 All PL
dbCAN_3:DIAMOND 0.9763 0.1065 0.9291 1.0235 All PL
dbCAN_3:eCAMI 0.6619 0.4030 0.4832 0.8405 All PL
dbCAN_4 0.9195 0.2361 0.8148 1.0241 All PL
dbCAN_4:HMMER 0.9195 0.2361 0.8148 1.0241 All PL
dbCAN_4:DIAMOND 0.9763 0.1065 0.9291 1.0235 All PL
dbCAN_4:dbCAN-sub 0.9205 0.2364 0.8157 1.0253 All PL
CUPP 0.6912 0.4207 0.5047 0.8777 All PL
dbCAN_2 0.7218 0.4491 0.3463 1.0973 Bacteria PL
dbCAN_2:HMMER 0.7500 0.4629 0.3630 1.1370 Bacteria PL
dbCAN_2:DIAMOND 0.6970 0.4448 0.3251 1.0688 Bacteria PL
dbCAN_2:Hotpep 0.6968 0.4368 0.3316 1.0620 Bacteria PL
dbCAN_3 1.0000 0.0000 1.0000 1.0000 Bacteria PL
dbCAN_3:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria PL
dbCAN_3:DIAMOND 0.9886 0.0321 0.9618 1.0155 Bacteria PL
dbCAN_3:eCAMI 0.6645 0.4311 0.3041 1.0249 Bacteria PL
dbCAN_4 1.0000 0.0000 1.0000 1.0000 Bacteria PL
dbCAN_4:HMMER 1.0000 0.0000 1.0000 1.0000 Bacteria PL
dbCAN_4:DIAMOND 1.0000 0.0000 1.0000 1.0000 Bacteria PL
dbCAN_4:dbCAN-sub 1.0000 0.0000 1.0000 1.0000 Bacteria PL
CUPP 0.7500 0.4629 0.3630 1.1370 Bacteria PL
dbCAN_2 0.8076 0.3448 0.6585 0.9568 Eukaryote PL
dbCAN_2:HMMER 0.8363 0.3501 0.6849 0.9877 Eukaryote PL
dbCAN_2:DIAMOND 0.8477 0.3415 0.7000 0.9953 Eukaryote PL
dbCAN_2:Hotpep 0.6951 0.3647 0.5411 0.8491 Eukaryote PL
dbCAN_3 0.9558 0.2084 0.8657 1.0459 Eukaryote PL
dbCAN_3:HMMER 0.9232 0.2313 0.8232 1.0233 Eukaryote PL
dbCAN_3:DIAMOND 0.9752 0.1043 0.9301 1.0202 Eukaryote PL
dbCAN_3:eCAMI 0.6599 0.3942 0.4894 0.8303 Eukaryote PL
dbCAN_4 0.9232 0.2313 0.8232 1.0233 Eukaryote PL
dbCAN_4:HMMER 0.9232 0.2313 0.8232 1.0233 Eukaryote PL
dbCAN_4:DIAMOND 0.9776 0.1042 0.9325 1.0226 Eukaryote PL
dbCAN_4:dbCAN-sub 0.9239 0.2315 0.8238 1.0240 Eukaryote PL
CUPP 0.7051 0.4164 0.5251 0.8852 Eukaryote PL
dbCAN_2 0.8647 0.3197 0.6877 1.0418 All CE
dbCAN_2:HMMER 0.7927 0.3954 0.5820 1.0034 All CE
dbCAN_2:DIAMOND 0.8291 0.2993 0.6633 0.9948 All CE
dbCAN_2:Hotpep 0.8543 0.3168 0.6788 1.0297 All CE
dbCAN_3 0.8612 0.3190 0.6846 1.0379 All CE
dbCAN_3:HMMER 0.8455 0.3459 0.6540 1.0371 All CE
dbCAN_3:DIAMOND 0.9106 0.2557 0.7690 1.0522 All CE
dbCAN_3:eCAMI 0.8006 0.3071 0.6306 0.9707 All CE
dbCAN_4 0.9213 0.2560 0.7796 1.0631 All CE
dbCAN_4:HMMER 0.9122 0.2560 0.7705 1.0540 All CE
dbCAN_4:DIAMOND 0.9562 0.0802 0.9118 1.0006 All CE
dbCAN_4:dbCAN-sub 0.9213 0.2560 0.7796 1.0631 All CE
CUPP 0.7847 0.4074 0.5591 1.0103 All CE
dbCAN_2 0.5921 0.4146 0.3625 0.8217 Bacteria CE
dbCAN_2:HMMER 0.6061 0.4871 0.3465 0.8656 Bacteria CE
dbCAN_2:DIAMOND 0.5569 0.3938 0.3295 0.7843 Bacteria CE
dbCAN_2:Hotpep 0.5075 0.3311 0.3164 0.6987 Bacteria CE
dbCAN_3 0.8002 0.3270 0.6026 0.9979 Bacteria CE
dbCAN_3:HMMER 0.7131 0.4478 0.4651 0.9611 Bacteria CE
dbCAN_3:DIAMOND 0.8198 0.2831 0.6488 0.9909 Bacteria CE
dbCAN_3:eCAMI 0.5145 0.3644 0.2943 0.7347 Bacteria CE
dbCAN_4 0.7919 0.4106 0.5646 1.0193 Bacteria CE
dbCAN_4:HMMER 0.7798 0.4066 0.5546 1.0050 Bacteria CE
dbCAN_4:DIAMOND 0.9729 0.0664 0.9328 1.0130 Bacteria CE
dbCAN_4:dbCAN-sub 0.7939 0.4116 0.5660 1.0219 Bacteria CE
CUPP 0.7500 0.4294 0.4905 1.0094 Bacteria CE
dbCAN_2 0.7329 0.3820 0.5429 0.9228 Eukaryote CE
dbCAN_2:HMMER 0.7161 0.4409 0.5036 0.9286 Eukaryote CE
dbCAN_2:DIAMOND 0.6889 0.3601 0.5098 0.8679 Eukaryote CE
dbCAN_2:Hotpep 0.7114 0.3473 0.5328 0.8899 Eukaryote CE
dbCAN_3 0.8505 0.2923 0.7002 1.0008 Eukaryote CE
dbCAN_3:HMMER 0.8114 0.3750 0.6249 0.9979 Eukaryote CE
dbCAN_3:DIAMOND 0.8806 0.2384 0.7580 1.0032 Eukaryote CE
dbCAN_3:eCAMI 0.6539 0.3210 0.4888 0.8190 Eukaryote CE
dbCAN_4 0.8773 0.3199 0.7182 1.0364 Eukaryote CE
dbCAN_4:HMMER 0.8670 0.3173 0.7091 1.0248 Eukaryote CE
dbCAN_4:DIAMOND 0.9706 0.0472 0.9463 0.9949 Eukaryote CE
dbCAN_4:dbCAN-sub 0.8787 0.3204 0.7193 1.0380 Eukaryote CE
CUPP 0.7414 0.4256 0.5226 0.9602 Eukaryote CE
dbCAN_2 1.0000 NA NaN NaN All AA
dbCAN_2:HMMER 1.0000 NA NaN NaN All AA
dbCAN_2:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_2:Hotpep 0.9375 NA NaN NaN All AA
dbCAN_3 1.0000 NA NaN NaN All AA
dbCAN_3:HMMER 1.0000 NA NaN NaN All AA
dbCAN_3:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_3:eCAMI 0.9375 NA NaN NaN All AA
dbCAN_4 1.0000 NA NaN NaN All AA
dbCAN_4:HMMER 1.0000 NA NaN NaN All AA
dbCAN_4:DIAMOND 0.9375 NA NaN NaN All AA
dbCAN_4:dbCAN-sub 1.0000 NA NaN NaN All AA
CUPP 1.0000 NA NaN NaN All AA
dbCAN_2 0.7553 0.3381 0.5681 0.9425 Bacteria AA
dbCAN_2:HMMER 0.8549 0.2725 0.7040 1.0058 Bacteria AA
dbCAN_2:DIAMOND 0.7601 0.2836 0.6030 0.9171 Bacteria AA
dbCAN_2:Hotpep 0.6953 0.3780 0.4860 0.9046 Bacteria AA
dbCAN_3 0.9085 0.2572 0.7661 1.0510 Bacteria AA
dbCAN_3:HMMER 0.9024 0.2643 0.7560 1.0488 Bacteria AA
dbCAN_3:DIAMOND 0.9485 0.1109 0.8870 1.0099 Bacteria AA
dbCAN_3:eCAMI 0.6247 0.4025 0.4018 0.8476 Bacteria AA
dbCAN_4 0.9042 0.2648 0.7575 1.0508 Bacteria AA
dbCAN_4:HMMER 0.9024 0.2643 0.7560 1.0488 Bacteria AA
dbCAN_4:DIAMOND 0.9580 0.1073 0.8986 1.0174 Bacteria AA
dbCAN_4:dbCAN-sub 0.9033 0.2645 0.7568 1.0498 Bacteria AA
CUPP 0.6737 0.4459 0.4267 0.9207 Bacteria AA
dbCAN_2 0.7706 0.3323 0.5935 0.9476 Eukaryote AA
dbCAN_2:HMMER 0.8640 0.2657 0.7224 1.0056 Eukaryote AA
dbCAN_2:DIAMOND 0.7751 0.2804 0.6256 0.9245 Eukaryote AA
dbCAN_2:Hotpep 0.7104 0.3702 0.5132 0.9077 Eukaryote AA
dbCAN_3 0.9142 0.2496 0.7813 1.0472 Eukaryote AA
dbCAN_3:HMMER 0.9085 0.2565 0.7718 1.0452 Eukaryote AA
dbCAN_3:DIAMOND 0.9517 0.1079 0.8942 1.0092 Eukaryote AA
dbCAN_3:eCAMI 0.6443 0.3966 0.4329 0.8556 Eukaryote AA
dbCAN_4 0.9102 0.2570 0.7732 1.0471 Eukaryote AA
dbCAN_4:HMMER 0.9085 0.2565 0.7718 1.0452 Eukaryote AA
dbCAN_4:DIAMOND 0.9567 0.1038 0.9014 1.0120 Eukaryote AA
dbCAN_4:dbCAN-sub 0.9093 0.2567 0.7725 1.0461 Eukaryote AA
CUPP 0.6941 0.4385 0.4604 0.9277 Eukaryote AA
dbCAN_2 0.8148 0.3268 0.7177 0.9118 All CBM
dbCAN_2:HMMER 0.7330 0.4028 0.6148 0.8513 All CBM
dbCAN_2:DIAMOND 0.8270 0.3239 0.7309 0.9232 All CBM
dbCAN_2:Hotpep 0.5425 0.4255 0.4335 0.6515 All CBM
dbCAN_3 0.8306 0.3297 0.7327 0.9285 All CBM
dbCAN_3:HMMER 0.7330 0.4028 0.6148 0.8513 All CBM
dbCAN_3:DIAMOND 0.8566 0.3249 0.7601 0.9531 All CBM
dbCAN_3:eCAMI 0.5937 0.4091 0.4842 0.7033 All CBM
dbCAN_4 0.8413 0.3390 0.7417 0.9408 All CBM
dbCAN_4:HMMER 0.7543 0.3894 0.6400 0.8687 All CBM
dbCAN_4:DIAMOND 0.8531 0.3242 0.7569 0.9494 All CBM
dbCAN_4:dbCAN-sub 0.8445 0.3353 0.7461 0.9429 All CBM
CUPP 0.0000 0.0000 0.0000 0.0000 All CBM
dbCAN_2 0.7128 0.3945 0.5499 0.8756 Bacteria CBM
dbCAN_2:HMMER 0.5236 0.4599 0.3338 0.7135 Bacteria CBM
dbCAN_2:DIAMOND 0.7543 0.3277 0.6190 0.8895 Bacteria CBM
dbCAN_2:Hotpep 0.3607 0.4477 0.2292 0.4921 Bacteria CBM
dbCAN_3 0.7928 0.3480 0.6491 0.9364 Bacteria CBM
dbCAN_3:HMMER 0.5636 0.4560 0.3754 0.7518 Bacteria CBM
dbCAN_3:DIAMOND 0.9176 0.2217 0.8240 1.0112 Bacteria CBM
dbCAN_3:eCAMI 0.4660 0.4568 0.3115 0.6206 Bacteria CBM
dbCAN_4 0.8158 0.3093 0.6882 0.9435 Bacteria CBM
dbCAN_4:HMMER 0.5944 0.4421 0.4119 0.7769 Bacteria CBM
dbCAN_4:DIAMOND 0.9588 0.1566 0.8927 1.0250 Bacteria CBM
dbCAN_4:dbCAN-sub 0.7481 0.3526 0.6025 0.8936 Bacteria CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Bacteria CBM
dbCAN_2 0.8099 0.3219 0.7228 0.8969 Eukaryote CBM
dbCAN_2:HMMER 0.6965 0.4123 0.5861 0.8069 Eukaryote CBM
dbCAN_2:DIAMOND 0.8203 0.3169 0.7346 0.9060 Eukaryote CBM
dbCAN_2:Hotpep 0.5879 0.4090 0.4896 0.6861 Eukaryote CBM
dbCAN_3 0.8467 0.3046 0.7643 0.9290 Eukaryote CBM
dbCAN_3:HMMER 0.7143 0.4032 0.6064 0.8223 Eukaryote CBM
dbCAN_3:DIAMOND 0.8772 0.2968 0.7969 0.9574 Eukaryote CBM
dbCAN_3:eCAMI 0.6048 0.3972 0.5064 0.7033 Eukaryote CBM
dbCAN_4 0.8332 0.3345 0.7436 0.9228 Eukaryote CBM
dbCAN_4:HMMER 0.7311 0.3923 0.6261 0.8362 Eukaryote CBM
dbCAN_4:DIAMOND 0.8811 0.2943 0.8016 0.9607 Eukaryote CBM
dbCAN_4:dbCAN-sub 0.8012 0.3530 0.7066 0.8957 Eukaryote CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Eukaryote CBM

11.2.3 Precision

Table 11.8: Overall performance (represented by the Precision) of CAZy family classification by family classifiers per taxonomy group
Prediction_tool Mean Standard_Deviation LowerCI UpperCI Tax_group CAZy_class
dbCAN_2 0.9031 0.2919 0.8507 0.9554 All GH
dbCAN_2:HMMER 0.8598 0.3354 0.8006 0.9189 All GH
dbCAN_2:DIAMOND 0.8848 0.3037 0.8303 0.9392 All GH
dbCAN_2:Hotpep 0.8395 0.3418 0.7784 0.9005 All GH
dbCAN_3 0.9336 0.2384 0.8907 0.9765 All GH
dbCAN_3:HMMER 0.8913 0.2986 0.8384 0.9442 All GH
dbCAN_3:DIAMOND 0.9307 0.2282 0.8896 0.9717 All GH
dbCAN_3:eCAMI 0.8137 0.3713 0.7472 0.8803 All GH
dbCAN_4 0.9123 0.2675 0.8646 0.9601 All GH
dbCAN_4:HMMER 0.8993 0.2878 0.8484 0.9503 All GH
dbCAN_4:DIAMOND 0.9395 0.2101 0.9017 0.9773 All GH
dbCAN_4:dbCAN-sub 0.9058 0.2703 0.8576 0.9541 All GH
CUPP 0.8186 0.3787 0.7504 0.8867 All GH
dbCAN_2 0.9508 0.1929 0.9102 0.9914 Bacteria GH
dbCAN_2:HMMER 0.9206 0.2424 0.8695 0.9716 Bacteria GH
dbCAN_2:DIAMOND 0.9487 0.1929 0.9080 0.9893 Bacteria GH
dbCAN_2:Hotpep 0.8796 0.3000 0.8164 0.9428 Bacteria GH
dbCAN_3 0.9845 0.0732 0.9689 1.0000 Bacteria GH
dbCAN_3:HMMER 0.9311 0.2227 0.8839 0.9782 Bacteria GH
dbCAN_3:DIAMOND 0.9768 0.0799 0.9599 0.9937 Bacteria GH
dbCAN_3:eCAMI 0.8400 0.3539 0.7654 0.9145 Bacteria GH
dbCAN_4 0.9582 0.1658 0.9230 0.9933 Bacteria GH
dbCAN_4:HMMER 0.9311 0.2227 0.8839 0.9782 Bacteria GH
dbCAN_4:DIAMOND 0.9767 0.0799 0.9598 0.9936 Bacteria GH
dbCAN_4:dbCAN-sub 0.9474 0.1936 0.9066 0.9882 Bacteria GH
CUPP 0.8541 0.3350 0.7831 0.9251 Bacteria GH
dbCAN_2 0.9190 0.2615 0.8747 0.9634 Eukaryote GH
dbCAN_2:HMMER 0.8746 0.3110 0.8227 0.9266 Eukaryote GH
dbCAN_2:DIAMOND 0.9125 0.2620 0.8681 0.9569 Eukaryote GH
dbCAN_2:Hotpep 0.8549 0.3239 0.8002 0.9097 Eukaryote GH
dbCAN_3 0.9533 0.1913 0.9207 0.9858 Eukaryote GH
dbCAN_3:HMMER 0.9030 0.2736 0.8571 0.9489 Eukaryote GH
dbCAN_3:DIAMOND 0.9443 0.1936 0.9114 0.9773 Eukaryote GH
dbCAN_3:eCAMI 0.8186 0.3655 0.7568 0.8803 Eukaryote GH
dbCAN_4 0.9245 0.2397 0.8840 0.9650 Eukaryote GH
dbCAN_4:HMMER 0.9102 0.2626 0.8662 0.9543 Eukaryote GH
dbCAN_4:DIAMOND 0.9509 0.1762 0.9209 0.9809 Eukaryote GH
dbCAN_4:dbCAN-sub 0.9127 0.2535 0.8700 0.9553 Eukaryote GH
CUPP 0.8331 0.3615 0.7715 0.8946 Eukaryote GH
dbCAN_2 0.9115 0.2758 0.8208 1.0021 All GT
dbCAN_2:HMMER 0.7868 0.3936 0.6641 0.9094 All GT
dbCAN_2:DIAMOND 0.9121 0.2746 0.8218 1.0024 All GT
dbCAN_2:Hotpep 0.8949 0.2741 0.8048 0.9850 All GT
dbCAN_3 0.9640 0.1695 0.9083 1.0197 All GT
dbCAN_3:HMMER 0.8344 0.3530 0.7244 0.9444 All GT
dbCAN_3:DIAMOND 0.9616 0.1679 0.9064 1.0168 All GT
dbCAN_3:eCAMI 0.8819 0.3050 0.7830 0.9807 All GT
dbCAN_4 0.9009 0.2717 0.8128 0.9889 All GT
dbCAN_4:HMMER 0.8150 0.3713 0.7007 0.9292 All GT
dbCAN_4:DIAMOND 0.9616 0.1679 0.9064 1.0168 All GT
dbCAN_4:dbCAN-sub 0.8935 0.2719 0.8054 0.9817 All GT
CUPP 0.8606 0.3419 0.7482 0.9729 All GT
dbCAN_2 0.9269 0.2022 0.8717 0.9821 Bacteria GT
dbCAN_2:HMMER 0.8640 0.2654 0.7915 0.9364 Bacteria GT
dbCAN_2:DIAMOND 0.9254 0.2022 0.8702 0.9806 Bacteria GT
dbCAN_2:Hotpep 0.8647 0.2978 0.7834 0.9460 Bacteria GT
dbCAN_3 0.9100 0.2345 0.8466 0.9734 Bacteria GT
dbCAN_3:HMMER 0.8664 0.2635 0.7952 0.9377 Bacteria GT
dbCAN_3:DIAMOND 0.9235 0.1986 0.8698 0.9772 Bacteria GT
dbCAN_3:eCAMI 0.8855 0.2720 0.8113 0.9598 Bacteria GT
dbCAN_4 0.8798 0.2651 0.8081 0.9514 Bacteria GT
dbCAN_4:HMMER 0.8664 0.2635 0.7952 0.9377 Bacteria GT
dbCAN_4:DIAMOND 0.9246 0.1979 0.8711 0.9781 Bacteria GT
dbCAN_4:dbCAN-sub 0.8790 0.2648 0.8074 0.9506 Bacteria GT
CUPP 0.8792 0.2704 0.8054 0.9531 Bacteria GT
dbCAN_2 0.9099 0.2523 0.8519 0.9680 Eukaryote GT
dbCAN_2:HMMER 0.8431 0.3071 0.7729 0.9133 Eukaryote GT
dbCAN_2:DIAMOND 0.9095 0.2522 0.8515 0.9676 Eukaryote GT
dbCAN_2:Hotpep 0.8627 0.3094 0.7915 0.9339 Eukaryote GT
dbCAN_3 0.9240 0.2266 0.8722 0.9758 Eukaryote GT
dbCAN_3:HMMER 0.8760 0.2588 0.8168 0.9351 Eukaryote GT
dbCAN_3:DIAMOND 0.9356 0.1990 0.8901 0.9811 Eukaryote GT
dbCAN_3:eCAMI 0.8668 0.3066 0.7967 0.9368 Eukaryote GT
dbCAN_4 0.8953 0.2508 0.8380 0.9526 Eukaryote GT
dbCAN_4:HMMER 0.8646 0.2758 0.8020 0.9272 Eukaryote GT
dbCAN_4:DIAMOND 0.9360 0.1985 0.8907 0.9814 Eukaryote GT
dbCAN_4:dbCAN-sub 0.8940 0.2511 0.8366 0.9513 Eukaryote GT
CUPP 0.8785 0.2933 0.8110 0.9460 Eukaryote GT
dbCAN_2 0.8636 0.3513 0.7079 1.0194 All PL
dbCAN_2:HMMER 0.8409 0.3581 0.6821 0.9997 All PL
dbCAN_2:DIAMOND 0.8485 0.3523 0.6923 1.0047 All PL
dbCAN_2:Hotpep 0.8261 0.3876 0.6585 0.9937 All PL
dbCAN_3 0.9091 0.2505 0.7980 1.0202 All PL
dbCAN_3:HMMER 0.9091 0.2505 0.7980 1.0202 All PL
dbCAN_3:DIAMOND 0.9318 0.1756 0.8540 1.0097 All PL
dbCAN_3:eCAMI 0.7692 0.4273 0.5798 0.9587 All PL
dbCAN_4 0.9091 0.2505 0.7980 1.0202 All PL
dbCAN_4:HMMER 0.9091 0.2505 0.7980 1.0202 All PL
dbCAN_4:DIAMOND 0.9318 0.1756 0.8540 1.0097 All PL
dbCAN_4:dbCAN-sub 0.9091 0.2505 0.7980 1.0202 All PL
CUPP 0.7727 0.4289 0.5825 0.9629 All PL
dbCAN_2 0.7426 0.4588 0.3591 1.1262 Bacteria PL
dbCAN_2:HMMER 0.7431 0.4590 0.3593 1.1268 Bacteria PL
dbCAN_2:DIAMOND 0.7431 0.4590 0.3593 1.1268 Bacteria PL
dbCAN_2:Hotpep 0.7426 0.4588 0.3591 1.1262 Bacteria PL
dbCAN_3 0.9514 0.1167 0.8538 1.0489 Bacteria PL
dbCAN_3:HMMER 0.9514 0.1167 0.8538 1.0489 Bacteria PL
dbCAN_3:DIAMOND 0.9014 0.1675 0.7614 1.0414 Bacteria PL
dbCAN_3:eCAMI 0.7422 0.4586 0.3588 1.1256 Bacteria PL
dbCAN_4 0.9514 0.1167 0.8538 1.0489 Bacteria PL
dbCAN_4:HMMER 0.9514 0.1167 0.8538 1.0489 Bacteria PL
dbCAN_4:DIAMOND 0.9014 0.1675 0.7614 1.0414 Bacteria PL
dbCAN_4:dbCAN-sub 0.9514 0.1167 0.8538 1.0489 Bacteria PL
CUPP 0.7326 0.4533 0.3537 1.1116 Bacteria PL
dbCAN_2 0.8689 0.3441 0.7201 1.0176 Eukaryote PL
dbCAN_2:HMMER 0.8471 0.3512 0.6953 0.9990 Eukaryote PL
dbCAN_2:DIAMOND 0.8544 0.3453 0.7050 1.0037 Eukaryote PL
dbCAN_2:Hotpep 0.8326 0.3804 0.6720 0.9933 Eukaryote PL
dbCAN_3 0.9210 0.2293 0.8219 1.0202 Eukaryote PL
dbCAN_3:HMMER 0.9210 0.2293 0.8219 1.0202 Eukaryote PL
dbCAN_3:DIAMOND 0.9329 0.1638 0.8620 1.0037 Eukaryote PL
dbCAN_3:eCAMI 0.7785 0.4198 0.5969 0.9601 Eukaryote PL
dbCAN_4 0.9210 0.2293 0.8219 1.0202 Eukaryote PL
dbCAN_4:HMMER 0.9210 0.2293 0.8219 1.0202 Eukaryote PL
dbCAN_4:DIAMOND 0.9329 0.1638 0.8620 1.0037 Eukaryote PL
dbCAN_4:dbCAN-sub 0.9211 0.2293 0.8219 1.0202 Eukaryote PL
CUPP 0.7796 0.4203 0.5979 0.9614 Eukaryote PL
dbCAN_2 0.7620 0.3500 0.5682 0.9558 All CE
dbCAN_2:HMMER 0.6935 0.4003 0.4802 0.9068 All CE
dbCAN_2:DIAMOND 0.8023 0.3373 0.6154 0.9891 All CE
dbCAN_2:Hotpep 0.7218 0.3409 0.5330 0.9106 All CE
dbCAN_3 0.7716 0.3480 0.5789 0.9643 All CE
dbCAN_3:HMMER 0.7397 0.3674 0.5363 0.9432 All CE
dbCAN_3:DIAMOND 0.8130 0.3027 0.6454 0.9806 All CE
dbCAN_3:eCAMI 0.7360 0.3466 0.5440 0.9280 All CE
dbCAN_4 0.8184 0.3054 0.6493 0.9875 All CE
dbCAN_4:HMMER 0.8113 0.3024 0.6438 0.9787 All CE
dbCAN_4:DIAMOND 0.8921 0.2058 0.7781 1.0061 All CE
dbCAN_4:dbCAN-sub 0.8184 0.3054 0.6493 0.9875 All CE
CUPP 0.7015 0.4142 0.4721 0.9309 All CE
dbCAN_2 0.6691 0.4593 0.4147 0.9234 Bacteria CE
dbCAN_2:HMMER 0.5558 0.4798 0.3002 0.8115 Bacteria CE
dbCAN_2:DIAMOND 0.7035 0.4627 0.4363 0.9706 Bacteria CE
dbCAN_2:Hotpep 0.7101 0.4225 0.4662 0.9540 Bacteria CE
dbCAN_3 0.8574 0.3090 0.6707 1.0442 Bacteria CE
dbCAN_3:HMMER 0.6596 0.4529 0.4088 0.9104 Bacteria CE
dbCAN_3:DIAMOND 0.8100 0.3346 0.6078 1.0122 Bacteria CE
dbCAN_3:eCAMI 0.7473 0.4292 0.4879 1.0066 Bacteria CE
dbCAN_4 0.7303 0.4233 0.4959 0.9647 Bacteria CE
dbCAN_4:HMMER 0.7262 0.4214 0.4929 0.9596 Bacteria CE
dbCAN_4:DIAMOND 0.8931 0.2307 0.7537 1.0325 Bacteria CE
dbCAN_4:dbCAN-sub 0.7285 0.4223 0.4947 0.9624 Bacteria CE
CUPP 0.6972 0.4468 0.4272 0.9672 Bacteria CE
dbCAN_2 0.6825 0.3922 0.4875 0.8775 Eukaryote CE
dbCAN_2:HMMER 0.6019 0.4201 0.3994 0.8044 Eukaryote CE
dbCAN_2:DIAMOND 0.7102 0.3901 0.5162 0.9042 Eukaryote CE
dbCAN_2:Hotpep 0.6974 0.3476 0.5186 0.8761 Eukaryote CE
dbCAN_3 0.8006 0.3203 0.6359 0.9653 Eukaryote CE
dbCAN_3:HMMER 0.6909 0.3810 0.5014 0.8804 Eukaryote CE
dbCAN_3:DIAMOND 0.8037 0.3014 0.6487 0.9586 Eukaryote CE
dbCAN_3:eCAMI 0.7082 0.3722 0.5168 0.8995 Eukaryote CE
dbCAN_4 0.7625 0.3416 0.5927 0.9324 Eukaryote CE
dbCAN_4:HMMER 0.7511 0.3412 0.5814 0.9208 Eukaryote CE
dbCAN_4:DIAMOND 0.8751 0.2163 0.7639 0.9863 Eukaryote CE
dbCAN_4:dbCAN-sub 0.7613 0.3408 0.5918 0.9307 Eukaryote CE
CUPP 0.6687 0.4313 0.4470 0.8905 Eukaryote CE
dbCAN_2 1.0000 NA NaN NaN All AA
dbCAN_2:HMMER 1.0000 NA NaN NaN All AA
dbCAN_2:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_2:Hotpep 1.0000 NA NaN NaN All AA
dbCAN_3 1.0000 NA NaN NaN All AA
dbCAN_3:HMMER 1.0000 NA NaN NaN All AA
dbCAN_3:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_3:eCAMI 1.0000 NA NaN NaN All AA
dbCAN_4 1.0000 NA NaN NaN All AA
dbCAN_4:HMMER 1.0000 NA NaN NaN All AA
dbCAN_4:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_4:dbCAN-sub 1.0000 NA NaN NaN All AA
CUPP 1.0000 NA NaN NaN All AA
dbCAN_2 0.7833 0.3445 0.5925 0.9741 Bacteria AA
dbCAN_2:HMMER 0.8137 0.2850 0.6558 0.9715 Bacteria AA
dbCAN_2:DIAMOND 0.8365 0.2643 0.6902 0.9828 Bacteria AA
dbCAN_2:Hotpep 0.7022 0.3980 0.4818 0.9226 Bacteria AA
dbCAN_3 0.8401 0.2627 0.6946 0.9856 Bacteria AA
dbCAN_3:HMMER 0.8210 0.2769 0.6677 0.9744 Bacteria AA
dbCAN_3:DIAMOND 0.8461 0.1717 0.7510 0.9412 Bacteria AA
dbCAN_3:eCAMI 0.6398 0.4184 0.4081 0.8715 Bacteria AA
dbCAN_4 0.8273 0.2732 0.6760 0.9786 Bacteria AA
dbCAN_4:HMMER 0.8210 0.2769 0.6677 0.9744 Bacteria AA
dbCAN_4:DIAMOND 0.8457 0.1809 0.7455 0.9459 Bacteria AA
dbCAN_4:dbCAN-sub 0.8238 0.2747 0.6717 0.9759 Bacteria AA
CUPP 0.6278 0.4369 0.3858 0.8698 Bacteria AA
dbCAN_2 0.7969 0.3372 0.6172 0.9765 Eukaryote AA
dbCAN_2:HMMER 0.8253 0.2793 0.6765 0.9741 Eukaryote AA
dbCAN_2:DIAMOND 0.8467 0.2585 0.7090 0.9845 Eukaryote AA
dbCAN_2:Hotpep 0.7208 0.3917 0.5121 0.9295 Eukaryote AA
dbCAN_3 0.8501 0.2569 0.7132 0.9870 Eukaryote AA
dbCAN_3:HMMER 0.8322 0.2712 0.6877 0.9767 Eukaryote AA
dbCAN_3:DIAMOND 0.8557 0.1703 0.7650 0.9464 Eukaryote AA
dbCAN_3:eCAMI 0.6623 0.4141 0.4416 0.8829 Eukaryote AA
dbCAN_4 0.8381 0.2675 0.6956 0.9806 Eukaryote AA
dbCAN_4:HMMER 0.8322 0.2712 0.6877 0.9767 Eukaryote AA
dbCAN_4:DIAMOND 0.8553 0.1790 0.7600 0.9507 Eukaryote AA
dbCAN_4:dbCAN-sub 0.8348 0.2690 0.6915 0.9781 Eukaryote AA
CUPP 0.6511 0.4322 0.4207 0.8814 Eukaryote AA
dbCAN_2 0.8246 0.3495 0.7208 0.9284 All CBM
dbCAN_2:HMMER 0.7683 0.4010 0.6505 0.8860 All CBM
dbCAN_2:DIAMOND 0.8082 0.3361 0.7084 0.9080 All CBM
dbCAN_2:Hotpep 0.2439 0.2880 0.1702 0.3177 All CBM
dbCAN_3 0.8462 0.3337 0.7471 0.9453 All CBM
dbCAN_3:HMMER 0.7683 0.4010 0.6505 0.8860 All CBM
dbCAN_3:DIAMOND 0.8357 0.3237 0.7396 0.9318 All CBM
dbCAN_3:eCAMI 0.4150 0.3652 0.3172 0.5128 All CBM
dbCAN_4 0.8194 0.3638 0.7125 0.9262 All CBM
dbCAN_4:HMMER 0.7895 0.3856 0.6763 0.9028 All CBM
dbCAN_4:DIAMOND 0.8369 0.3245 0.7405 0.9332 All CBM
dbCAN_4:dbCAN-sub 0.8138 0.3652 0.7065 0.9210 All CBM
CUPP 0.0000 0.0000 0.0000 0.0000 All CBM
dbCAN_2 0.7155 0.3901 0.5544 0.8765 Bacteria CBM
dbCAN_2:HMMER 0.7175 0.4568 0.5289 0.9061 Bacteria CBM
dbCAN_2:DIAMOND 0.7844 0.3234 0.6509 0.9179 Bacteria CBM
dbCAN_2:Hotpep 0.2061 0.3147 0.1137 0.2985 Bacteria CBM
dbCAN_3 0.8270 0.3297 0.6909 0.9631 Bacteria CBM
dbCAN_3:HMMER 0.7575 0.4346 0.5781 0.9369 Bacteria CBM
dbCAN_3:DIAMOND 0.8972 0.2198 0.8044 0.9900 Bacteria CBM
dbCAN_3:eCAMI 0.3881 0.3964 0.2540 0.5223 Bacteria CBM
dbCAN_4 0.9374 0.2020 0.8540 1.0208 Bacteria CBM
dbCAN_4:HMMER 0.7908 0.4051 0.6236 0.9580 Bacteria CBM
dbCAN_4:DIAMOND 0.9409 0.1372 0.8830 0.9988 Bacteria CBM
dbCAN_4:dbCAN-sub 0.8766 0.2852 0.7589 0.9943 Bacteria CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Bacteria CBM
dbCAN_2 0.8126 0.3393 0.7208 0.9043 Eukaryote CBM
dbCAN_2:HMMER 0.7678 0.4055 0.6592 0.8764 Eukaryote CBM
dbCAN_2:DIAMOND 0.7979 0.3224 0.7108 0.8851 Eukaryote CBM
dbCAN_2:Hotpep 0.2788 0.3006 0.2066 0.3510 Eukaryote CBM
dbCAN_3 0.8552 0.3137 0.7704 0.9400 Eukaryote CBM
dbCAN_3:HMMER 0.7857 0.3929 0.6805 0.8909 Eukaryote CBM
dbCAN_3:DIAMOND 0.8490 0.2977 0.7685 0.9295 Eukaryote CBM
dbCAN_3:eCAMI 0.4285 0.3532 0.3410 0.5160 Eukaryote CBM
dbCAN_4 0.8451 0.3412 0.7537 0.9364 Eukaryote CBM
dbCAN_4:HMMER 0.8028 0.3786 0.7014 0.9042 Eukaryote CBM
dbCAN_4:DIAMOND 0.8676 0.2812 0.7916 0.9436 Eukaryote CBM
dbCAN_4:dbCAN-sub 0.8212 0.3608 0.7246 0.9179 Eukaryote CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Eukaryote CBM
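
The LowerCI and UpperCI columns in Table 11.8 are not clipped to the [0, 1] range of the metric, which is why some upper limits exceed 1. Rows whose standard deviation is NA also have NaN limits, as would be expected if only a single value contributed to the row. The sketch below shows one way such limits could be rebuilt from a mean, a standard deviation and a sample size. It assumes a two-sided 95% t-interval: this section does not state the interval method, although the Bacteria PL rows are numerically consistent with that assumption for 8 values. The function name `ci_from_summary`, the sample size `n = 8` and the critical value 2.3646 (the t quantile for 7 degrees of freedom) are illustrative assumptions, not values taken from the report.

```python
import math


def ci_from_summary(mean, sd, n, crit):
    """Return (lower, upper) for mean +/- crit * sd / sqrt(n).

    `crit` is the critical value of the assumed distribution,
    for example 2.3646 for a 95% t-interval with n = 8.
    """
    if n < 2 or sd is None or math.isnan(sd):
        # A single value has no sample standard deviation,
        # so no interval can be formed.
        return (math.nan, math.nan)
    half_width = crit * sd / math.sqrt(n)
    # The limits are deliberately not clipped to [0, 1].
    return (mean - half_width, mean + half_width)


# dbCAN_2, Bacteria PL row of Table 11.8; n = 8 is an assumption.
lower, upper = ci_from_summary(0.7426, 0.4588, n=8, crit=2.3646)
print(f"{lower:.4f} {upper:.4f}")
```

Under these assumptions the computed limits agree with the tabulated LowerCI and UpperCI for that row to within rounding. An upper limit above 1 therefore reflects a wide interval around a small sample, not a precision greater than 1.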

11.2.4 F1-score

Table 11.9: Overall performance (represented by the F1-score) of CAZy family classification by CAZy classifiers per taxonomy group
Prediction_tool Mean Standard_Deviation LowerCI UpperCI Tax_group CAZy_class
dbCAN_2 0.8778 0.2917 0.8255 0.9301 All GH
dbCAN_2:HMMER 0.8436 0.3298 0.7855 0.9017 All GH
dbCAN_2:DIAMOND 0.8641 0.2999 0.8104 0.9179 All GH
dbCAN_2:Hotpep 0.8139 0.3357 0.7540 0.8738 All GH
dbCAN_3 0.9211 0.2397 0.8779 0.9642 All GH
dbCAN_3:HMMER 0.8754 0.2940 0.8234 0.9275 All GH
dbCAN_3:DIAMOND 0.9259 0.2241 0.8855 0.9662 All GH
dbCAN_3:eCAMI 0.7778 0.3624 0.7128 0.8427 All GH
dbCAN_4 0.9047 0.2624 0.8579 0.9515 All GH
dbCAN_4:HMMER 0.8834 0.2834 0.8333 0.9336 All GH
dbCAN_4:DIAMOND 0.9340 0.2070 0.8968 0.9713 All GH
dbCAN_4:dbCAN-sub 0.8981 0.2645 0.8509 0.9453 All GH
CUPP 0.7968 0.3740 0.7295 0.8641 All GH
dbCAN_2 0.9102 0.2036 0.8673 0.9531 Bacteria GH
dbCAN_2:HMMER 0.9033 0.2411 0.8526 0.9541 Bacteria GH
dbCAN_2:DIAMOND 0.9045 0.2151 0.8592 0.9498 Bacteria GH
dbCAN_2:Hotpep 0.8156 0.2896 0.7546 0.8766 Bacteria GH
dbCAN_3 0.9621 0.0980 0.9413 0.9829 Bacteria GH
dbCAN_3:HMMER 0.9179 0.2212 0.8710 0.9648 Bacteria GH
dbCAN_3:DIAMOND 0.9784 0.0701 0.9635 0.9933 Bacteria GH
dbCAN_3:eCAMI 0.7551 0.3390 0.6837 0.8265 Bacteria GH
dbCAN_4 0.9589 0.1615 0.9247 0.9931 Bacteria GH
dbCAN_4:HMMER 0.9179 0.2212 0.8710 0.9648 Bacteria GH
dbCAN_4:DIAMOND 0.9774 0.0711 0.9623 0.9925 Bacteria GH
dbCAN_4:dbCAN-sub 0.9478 0.1900 0.9078 0.9878 Bacteria GH
CUPP 0.8374 0.3315 0.7672 0.9077 Bacteria GH
dbCAN_2 0.8881 0.2624 0.8436 0.9326 Eukaryote GH
dbCAN_2:HMMER 0.8554 0.3102 0.8036 0.9073 Eukaryote GH
dbCAN_2:DIAMOND 0.8853 0.2617 0.8409 0.9296 Eukaryote GH
dbCAN_2:Hotpep 0.8086 0.3148 0.7554 0.8618 Eukaryote GH
dbCAN_3 0.9383 0.1932 0.9054 0.9712 Eukaryote GH
dbCAN_3:HMMER 0.8855 0.2747 0.8394 0.9315 Eukaryote GH
dbCAN_3:DIAMOND 0.9427 0.1908 0.9102 0.9752 Eukaryote GH
dbCAN_3:eCAMI 0.7603 0.3526 0.7007 0.8199 Eukaryote GH
dbCAN_4 0.9200 0.2349 0.8804 0.9597 Eukaryote GH
dbCAN_4:HMMER 0.8927 0.2643 0.8483 0.9370 Eukaryote GH
dbCAN_4:DIAMOND 0.9485 0.1730 0.9190 0.9779 Eukaryote GH
dbCAN_4:dbCAN-sub 0.9079 0.2481 0.8661 0.9496 Eukaryote GH
CUPP 0.8005 0.3601 0.7392 0.8618 Eukaryote GH
dbCAN_2 0.8522 0.2834 0.7591 0.9454 All GT
dbCAN_2:HMMER 0.7357 0.3790 0.6176 0.8538 All GT
dbCAN_2:DIAMOND 0.8573 0.2802 0.7652 0.9494 All GT
dbCAN_2:Hotpep 0.8538 0.2711 0.7647 0.9429 All GT
dbCAN_3 0.9001 0.2062 0.8323 0.9679 All GT
dbCAN_3:HMMER 0.7834 0.3440 0.6762 0.8905 All GT
dbCAN_3:DIAMOND 0.9178 0.1912 0.8549 0.9806 All GT
dbCAN_3:eCAMI 0.8338 0.3071 0.7342 0.9333 All GT
dbCAN_4 0.8655 0.2772 0.7757 0.9554 All GT
dbCAN_4:HMMER 0.7651 0.3603 0.6543 0.8760 All GT
dbCAN_4:DIAMOND 0.9146 0.1907 0.8519 0.9773 All GT
dbCAN_4:dbCAN-sub 0.8610 0.2760 0.7715 0.9505 All GT
CUPP 0.8216 0.3331 0.7121 0.9311 All GT
dbCAN_2 0.8829 0.2216 0.8225 0.9434 Bacteria GT
dbCAN_2:HMMER 0.8581 0.2555 0.7884 0.9279 Bacteria GT
dbCAN_2:DIAMOND 0.8982 0.2108 0.8407 0.9558 Bacteria GT
dbCAN_2:Hotpep 0.7793 0.3028 0.6966 0.8619 Bacteria GT
dbCAN_3 0.8971 0.2235 0.8367 0.9575 Bacteria GT
dbCAN_3:HMMER 0.8546 0.2544 0.7859 0.9234 Bacteria GT
dbCAN_3:DIAMOND 0.9327 0.1752 0.8854 0.9801 Bacteria GT
dbCAN_3:eCAMI 0.7917 0.2856 0.7138 0.8697 Bacteria GT
dbCAN_4 0.8900 0.2459 0.8235 0.9565 Bacteria GT
dbCAN_4:HMMER 0.8545 0.2544 0.7858 0.9233 Bacteria GT
dbCAN_4:DIAMOND 0.9344 0.1731 0.8876 0.9812 Bacteria GT
dbCAN_4:dbCAN-sub 0.8900 0.2459 0.8236 0.9565 Bacteria GT
CUPP 0.8192 0.2627 0.7475 0.8910 Bacteria GT
dbCAN_2 0.8656 0.2610 0.8055 0.9256 Eukaryote GT
dbCAN_2:HMMER 0.8287 0.2984 0.7605 0.8969 Eukaryote GT
dbCAN_2:DIAMOND 0.8742 0.2586 0.8147 0.9337 Eukaryote GT
dbCAN_2:Hotpep 0.8003 0.3072 0.7296 0.8710 Eukaryote GT
dbCAN_3 0.9006 0.2230 0.8496 0.9515 Eukaryote GT
dbCAN_3:HMMER 0.8616 0.2504 0.8044 0.9188 Eukaryote GT
dbCAN_3:DIAMOND 0.9266 0.1899 0.8832 0.9700 Eukaryote GT
dbCAN_3:eCAMI 0.7985 0.3097 0.7277 0.8692 Eukaryote GT
dbCAN_4 0.8923 0.2395 0.8376 0.9471 Eukaryote GT
dbCAN_4:HMMER 0.8503 0.2674 0.7897 0.9110 Eukaryote GT
dbCAN_4:DIAMOND 0.9263 0.1884 0.8832 0.9693 Eukaryote GT
dbCAN_4:dbCAN-sub 0.8915 0.2396 0.8368 0.9463 Eukaryote GT
CUPP 0.8230 0.2883 0.7567 0.8893 Eukaryote GT
dbCAN_2 0.8269 0.3465 0.6733 0.9805 All PL
dbCAN_2:HMMER 0.8263 0.3505 0.6710 0.9817 All PL
dbCAN_2:DIAMOND 0.8476 0.3484 0.6931 1.0020 All PL
dbCAN_2:Hotpep 0.7400 0.3677 0.5810 0.8990 All PL
dbCAN_3 0.9237 0.2283 0.8225 1.0250 All PL
dbCAN_3:HMMER 0.9021 0.2333 0.7987 1.0055 All PL
dbCAN_3:DIAMOND 0.9465 0.1396 0.8846 1.0084 All PL
dbCAN_3:eCAMI 0.7021 0.4049 0.5226 0.8817 All PL
dbCAN_4 0.9021 0.2333 0.7987 1.0055 All PL
dbCAN_4:HMMER 0.9021 0.2333 0.7987 1.0055 All PL
dbCAN_4:DIAMOND 0.9465 0.1396 0.8846 1.0084 All PL
dbCAN_4:dbCAN-sub 0.9026 0.2335 0.7991 1.0061 All PL
CUPP 0.7198 0.4156 0.5355 0.9040 All PL
dbCAN_2 0.7313 0.4526 0.3529 1.1097 Bacteria PL
dbCAN_2:HMMER 0.7464 0.4608 0.3612 1.1317 Bacteria PL
dbCAN_2:DIAMOND 0.7155 0.4465 0.3422 1.0887 Bacteria PL
dbCAN_2:Hotpep 0.7174 0.4448 0.3455 1.0893 Bacteria PL
dbCAN_3 0.9714 0.0700 0.9129 1.0299 Bacteria PL
dbCAN_3:HMMER 0.9714 0.0700 0.9129 1.0299 Bacteria PL
dbCAN_3:DIAMOND 0.9342 0.1007 0.8500 1.0184 Bacteria PL
dbCAN_3:eCAMI 0.6960 0.4375 0.3303 1.0618 Bacteria PL
dbCAN_4 0.9714 0.0700 0.9129 1.0299 Bacteria PL
dbCAN_4:HMMER 0.9714 0.0700 0.9129 1.0299 Bacteria PL
dbCAN_4:DIAMOND 0.9402 0.1033 0.8538 1.0265 Bacteria PL
dbCAN_4:dbCAN-sub 0.9714 0.0700 0.9129 1.0299 Bacteria PL
CUPP 0.7410 0.4576 0.3584 1.1236 Bacteria PL
dbCAN_2 0.8319 0.3393 0.6852 0.9787 Eukaryote PL
dbCAN_2:HMMER 0.8337 0.3442 0.6848 0.9825 Eukaryote PL
dbCAN_2:DIAMOND 0.8484 0.3400 0.7014 0.9954 Eukaryote PL
dbCAN_2:Hotpep 0.7465 0.3610 0.5941 0.8990 Eukaryote PL
dbCAN_3 0.9342 0.2161 0.8408 1.0276 Eukaryote PL
dbCAN_3:HMMER 0.9135 0.2222 0.8174 1.0096 Eukaryote PL
dbCAN_3:DIAMOND 0.9477 0.1308 0.8911 1.0043 Eukaryote PL
dbCAN_3:eCAMI 0.7043 0.3957 0.5332 0.8754 Eukaryote PL
dbCAN_4 0.9135 0.2222 0.8174 1.0096 Eukaryote PL
dbCAN_4:HMMER 0.9135 0.2222 0.8174 1.0096 Eukaryote PL
dbCAN_4:DIAMOND 0.9489 0.1312 0.8922 1.0057 Eukaryote PL
dbCAN_4:dbCAN-sub 0.9139 0.2223 0.8177 1.0100 Eukaryote PL
CUPP 0.7307 0.4094 0.5537 0.9077 Eukaryote PL
dbCAN_2 0.7938 0.3298 0.6111 0.9764 All CE
dbCAN_2:HMMER 0.7240 0.3891 0.5167 0.9314 All CE
dbCAN_2:DIAMOND 0.7963 0.3095 0.6249 0.9677 All CE
dbCAN_2:Hotpep 0.7669 0.3249 0.5870 0.9469 All CE
dbCAN_3 0.7987 0.3295 0.6163 0.9812 All CE
dbCAN_3:HMMER 0.7723 0.3497 0.5786 0.9660 All CE
dbCAN_3:DIAMOND 0.8446 0.2718 0.6941 0.9951 All CE
dbCAN_3:eCAMI 0.7529 0.3236 0.5737 0.9321 All CE
dbCAN_4 0.8515 0.2802 0.6963 1.0066 All CE
dbCAN_4:HMMER 0.8435 0.2780 0.6895 0.9974 All CE
dbCAN_4:DIAMOND 0.9062 0.1399 0.8287 0.9837 All CE
dbCAN_4:dbCAN-sub 0.8515 0.2802 0.6963 1.0066 All CE
CUPP 0.7259 0.4042 0.5020 0.9497 All CE
dbCAN_2 0.6131 0.4274 0.3764 0.8497 Bacteria CE
dbCAN_2:HMMER 0.5665 0.4757 0.3131 0.8200 Bacteria CE
dbCAN_2:DIAMOND 0.6139 0.4167 0.3733 0.8544 Bacteria CE
dbCAN_2:Hotpep 0.5720 0.3512 0.3692 0.7747 Bacteria CE
dbCAN_3 0.7983 0.3111 0.6103 0.9862 Bacteria CE
dbCAN_3:HMMER 0.6710 0.4453 0.4244 0.9176 Bacteria CE
dbCAN_3:DIAMOND 0.7870 0.2998 0.6058 0.9682 Bacteria CE
dbCAN_3:eCAMI 0.5909 0.3773 0.3629 0.8189 Bacteria CE
dbCAN_4 0.7453 0.4141 0.5160 0.9746 Bacteria CE
dbCAN_4:HMMER 0.7376 0.4112 0.5099 0.9654 Bacteria CE
dbCAN_4:DIAMOND 0.9092 0.1817 0.7994 1.0190 Bacteria CE
dbCAN_4:dbCAN-sub 0.7454 0.4141 0.5160 0.9747 Bacteria CE
CUPP 0.7054 0.4327 0.4439 0.9669 Bacteria CE
dbCAN_2 0.6948 0.3766 0.5075 0.8820 Eukaryote CE
dbCAN_2:HMMER 0.6399 0.4171 0.4388 0.8409 Eukaryote CE
dbCAN_2:DIAMOND 0.6906 0.3652 0.5090 0.8722 Eukaryote CE
dbCAN_2:Hotpep 0.6801 0.3221 0.5145 0.8458 Eukaryote CE
dbCAN_3 0.8134 0.2992 0.6595 0.9672 Eukaryote CE
dbCAN_3:HMMER 0.7310 0.3672 0.5483 0.9136 Eukaryote CE
dbCAN_3:DIAMOND 0.8248 0.2636 0.6892 0.9603 Eukaryote CE
dbCAN_3:eCAMI 0.6668 0.3325 0.4959 0.8378 Eukaryote CE
dbCAN_4 0.8015 0.3232 0.6408 0.9622 Eukaryote CE
dbCAN_4:HMMER 0.7905 0.3222 0.6303 0.9508 Eukaryote CE
dbCAN_4:DIAMOND 0.9041 0.1502 0.8269 0.9814 Eukaryote CE
dbCAN_4:dbCAN-sub 0.8015 0.3232 0.6408 0.9622 Eukaryote CE
CUPP 0.6880 0.4173 0.4734 0.9025 Eukaryote CE
dbCAN_2 1.0000 NA NaN NaN All AA
dbCAN_2:HMMER 1.0000 NA NaN NaN All AA
dbCAN_2:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_2:Hotpep 0.9677 NA NaN NaN All AA
dbCAN_3 1.0000 NA NaN NaN All AA
dbCAN_3:HMMER 1.0000 NA NaN NaN All AA
dbCAN_3:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_3:eCAMI 0.9677 NA NaN NaN All AA
dbCAN_4 1.0000 NA NaN NaN All AA
dbCAN_4:HMMER 1.0000 NA NaN NaN All AA
dbCAN_4:DIAMOND 0.9677 NA NaN NaN All AA
dbCAN_4:dbCAN-sub 1.0000 NA NaN NaN All AA
CUPP 1.0000 NA NaN NaN All AA
dbCAN_2 0.7567 0.3256 0.5764 0.9371 Bacteria AA
dbCAN_2:HMMER 0.8234 0.2645 0.6769 0.9699 Bacteria AA
dbCAN_2:DIAMOND 0.7800 0.2501 0.6415 0.9185 Bacteria AA
dbCAN_2:Hotpep 0.6892 0.3773 0.4803 0.8981 Bacteria AA
dbCAN_3 0.8671 0.2500 0.7286 1.0055 Bacteria AA
dbCAN_3:HMMER 0.8482 0.2552 0.7069 0.9896 Bacteria AA
dbCAN_3:DIAMOND 0.8875 0.1295 0.8158 0.9593 Bacteria AA
dbCAN_3:eCAMI 0.6251 0.3994 0.4039 0.8462 Bacteria AA
dbCAN_4 0.8534 0.2544 0.7126 0.9943 Bacteria AA
dbCAN_4:HMMER 0.8482 0.2552 0.7069 0.9896 Bacteria AA
dbCAN_4:DIAMOND 0.8925 0.1406 0.8147 0.9704 Bacteria AA
dbCAN_4:dbCAN-sub 0.8506 0.2547 0.7096 0.9917 Bacteria AA
CUPP 0.6464 0.4377 0.4040 0.8887 Bacteria AA
dbCAN_2 0.7720 0.3204 0.6012 0.9427 Eukaryote AA
dbCAN_2:HMMER 0.8344 0.2593 0.6963 0.9726 Eukaryote AA
dbCAN_2:DIAMOND 0.7937 0.2478 0.6617 0.9258 Eukaryote AA
dbCAN_2:Hotpep 0.7066 0.3711 0.5089 0.9043 Eukaryote AA
dbCAN_3 0.8754 0.2438 0.7455 1.0053 Eukaryote AA
dbCAN_3:HMMER 0.8577 0.2494 0.7248 0.9906 Eukaryote AA
dbCAN_3:DIAMOND 0.8945 0.1283 0.8262 0.9629 Eukaryote AA
dbCAN_3:eCAMI 0.6465 0.3952 0.4359 0.8571 Eukaryote AA
dbCAN_4 0.8626 0.2485 0.7302 0.9950 Eukaryote AA
dbCAN_4:HMMER 0.8577 0.2494 0.7248 0.9906 Eukaryote AA
dbCAN_4:DIAMOND 0.8972 0.1371 0.8242 0.9703 Eukaryote AA
dbCAN_4:dbCAN-sub 0.8600 0.2489 0.7273 0.9926 Eukaryote AA
CUPP 0.6685 0.4320 0.4383 0.8987 Eukaryote AA
dbCAN_2 0.7987 0.3338 0.6996 0.8978 All CBM
dbCAN_2:HMMER 0.7286 0.3965 0.6121 0.8450 All CBM
dbCAN_2:DIAMOND 0.8019 0.3200 0.7068 0.8969 All CBM
dbCAN_2:Hotpep 0.3038 0.3136 0.2235 0.3841 All CBM
dbCAN_3 0.8201 0.3271 0.7229 0.9172 All CBM
dbCAN_3:HMMER 0.7286 0.3965 0.6121 0.8450 All CBM
dbCAN_3:DIAMOND 0.8352 0.3182 0.7407 0.9297 All CBM
dbCAN_3:eCAMI 0.4546 0.3598 0.3582 0.5510 All CBM
dbCAN_4 0.8130 0.3508 0.7101 0.9160 All CBM
dbCAN_4:HMMER 0.7499 0.3832 0.6373 0.8624 All CBM
dbCAN_4:DIAMOND 0.8356 0.3190 0.7409 0.9304 All CBM
dbCAN_4:dbCAN-sub 0.8128 0.3510 0.7098 0.9159 All CBM
CUPP 0.0000 0.0000 0.0000 0.0000 All CBM
dbCAN_2 0.7083 0.3867 0.5486 0.8679 Bacteria CBM
dbCAN_2:HMMER 0.5542 0.4569 0.3656 0.7428 Bacteria CBM
dbCAN_2:DIAMOND 0.7580 0.3136 0.6285 0.8874 Bacteria CBM
dbCAN_2:Hotpep 0.2405 0.3326 0.1428 0.3381 Bacteria CBM
dbCAN_3 0.7969 0.3364 0.6581 0.9358 Bacteria CBM
dbCAN_3:HMMER 0.5942 0.4500 0.4084 0.7800 Bacteria CBM
dbCAN_3:DIAMOND 0.9000 0.2139 0.8097 0.9903 Bacteria CBM
dbCAN_3:eCAMI 0.4066 0.4019 0.2707 0.5426 Bacteria CBM
dbCAN_4 0.8457 0.2728 0.7331 0.9583 Bacteria CBM
dbCAN_4:HMMER 0.6262 0.4342 0.4470 0.8054 Bacteria CBM
dbCAN_4:DIAMOND 0.9323 0.1562 0.8663 0.9983 Bacteria CBM
dbCAN_4:dbCAN-sub 0.7803 0.3240 0.6465 0.9140 Bacteria CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Bacteria CBM
dbCAN_2 0.7911 0.3255 0.7031 0.8791 Eukaryote CBM
dbCAN_2:HMMER 0.7000 0.4068 0.5911 0.8090 Eukaryote CBM
dbCAN_2:DIAMOND 0.7972 0.3121 0.7128 0.8816 Eukaryote CBM
dbCAN_2:Hotpep 0.3416 0.3155 0.2658 0.4174 Eukaryote CBM
dbCAN_3 0.8334 0.3055 0.7508 0.9159 Eukaryote CBM
dbCAN_3:HMMER 0.7179 0.3973 0.6115 0.8243 Eukaryote CBM
dbCAN_3:DIAMOND 0.8551 0.2931 0.7758 0.9343 Eukaryote CBM
dbCAN_3:eCAMI 0.4658 0.3454 0.3803 0.5514 Eukaryote CBM
dbCAN_4 0.8144 0.3410 0.7231 0.9057 Eukaryote CBM
dbCAN_4:HMMER 0.7349 0.3862 0.6314 0.8383 Eukaryote CBM
dbCAN_4:DIAMOND 0.8581 0.2917 0.7793 0.9370 Eukaryote CBM
dbCAN_4:dbCAN-sub 0.7832 0.3592 0.6870 0.8794 Eukaryote CBM
CUPP 0.0000 0.0000 0.0000 0.0000 Eukaryote CBM
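
Table 11.9 appears to summarize F1-scores that are calculated per CAZy family and then averaged. If so, the mean F1-score in a row is generally not the harmonic mean of the corresponding mean precision and mean recall. A minimal sketch of a per-family F1-score calculated from confusion counts follows. The function name `f1_from_counts`, the example counts, and the choice to return 0 when a score is undefined are assumptions made for illustration; the report's own handling of undefined scores is not shown in this section.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN) for a single CAZy family."""
    denom = 2 * tp + fp + fn
    if denom == 0:
        # Assumed convention: an undefined score is reported as 0.
        return 0.0
    return 2 * tp / denom


# Hypothetical (TP, FP, FN) counts for three families.
counts = {"GH1": (8, 2, 2), "GH5": (5, 0, 5), "GH13": (0, 3, 4)}

# Score each family separately, then average across families.
per_family = {fam: f1_from_counts(*c) for fam, c in counts.items()}
mean_f1 = sum(per_family.values()) / len(per_family)
print(per_family, round(mean_f1, 4))
```

A family with no true positives contributes 0 to the mean, so a tool that never predicts any family of a class scores a mean of 0 for that class under this convention.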

11.2.5 Accuracy

Table 11.10: Overall performance (represented by the Accuracy) of CAZy family classification by CAZy classifiers per taxonomy group
Prediction_tool Mean Standard_Deviation LowerCI UpperCI Tax_group CAZy_class
dbCAN_2 0.9996 0.0013 0.9993 0.9998 All GH
dbCAN_2:HMMER 0.9996 0.0013 0.9994 0.9998 All GH
dbCAN_2:DIAMOND 0.9996 0.0012 0.9994 0.9998 All GH
dbCAN_2:Hotpep 0.9993 0.0019 0.9989 0.9996 All GH
dbCAN_3 0.9997 0.0009 0.9996 0.9999 All GH
dbCAN_3:HMMER 0.9996 0.0013 0.9994 0.9998 All GH
dbCAN_3:DIAMOND 0.9997 0.0010 0.9995 0.9999 All GH
dbCAN_3:eCAMI 0.9994 0.0014 0.9991 0.9996 All GH
dbCAN_4 0.9997 0.0009 0.9996 0.9999 All GH
dbCAN_4:HMMER 0.9996 0.0013 0.9994 0.9998 All GH
dbCAN_4:DIAMOND 0.9997 0.0009 0.9996 0.9999 All GH
dbCAN_4:dbCAN-sub 0.9997 0.0009 0.9996 0.9999 All GH
CUPP 0.9996 0.0011 0.9994 0.9998 All GH
dbCAN_2 0.9994 0.0018 0.9990 0.9998 Bacteria GH
dbCAN_2:HMMER 0.9993 0.0024 0.9988 0.9998 Bacteria GH
dbCAN_2:DIAMOND 0.9996 0.0008 0.9994 0.9998 Bacteria GH
dbCAN_2:Hotpep 0.9990 0.0021 0.9986 0.9995 Bacteria GH
dbCAN_3 0.9997 0.0011 0.9994 0.9999 Bacteria GH
dbCAN_3:HMMER 0.9994 0.0023 0.9989 0.9999 Bacteria GH
dbCAN_3:DIAMOND 0.9998 0.0004 0.9997 0.9999 Bacteria GH
dbCAN_3:eCAMI 0.9991 0.0015 0.9988 0.9994 Bacteria GH
dbCAN_4 0.9996 0.0022 0.9991 1.0001 Bacteria GH
dbCAN_4:HMMER 0.9994 0.0023 0.9989 0.9999 Bacteria GH
dbCAN_4:DIAMOND 0.9998 0.0004 0.9997 0.9999 Bacteria GH
dbCAN_4:dbCAN-sub 0.9996 0.0022 0.9991 1.0001 Bacteria GH
CUPP 0.9994 0.0023 0.9989 0.9998 Bacteria GH
dbCAN_2 0.9996 0.0013 0.9994 0.9998 Eukaryote GH
dbCAN_2:HMMER 0.9996 0.0014 0.9994 0.9998 Eukaryote GH
dbCAN_2:DIAMOND 0.9997 0.0008 0.9996 0.9998 Eukaryote GH
dbCAN_2:Hotpep 0.9994 0.0016 0.9991 0.9996 Eukaryote GH
dbCAN_3 0.9998 0.0008 0.9996 0.9999 Eukaryote GH
dbCAN_3:HMMER 0.9996 0.0014 0.9994 0.9999 Eukaryote GH
dbCAN_3:DIAMOND 0.9998 0.0005 0.9997 0.9999 Eukaryote GH
dbCAN_3:eCAMI 0.9994 0.0012 0.9992 0.9996 Eukaryote GH
dbCAN_4 0.9998 0.0013 0.9995 1.0000 Eukaryote GH
dbCAN_4:HMMER 0.9996 0.0014 0.9994 0.9999 Eukaryote GH
dbCAN_4:DIAMOND 0.9998 0.0005 0.9997 0.9999 Eukaryote GH
dbCAN_4:dbCAN-sub 0.9997 0.0013 0.9995 1.0000 Eukaryote GH
CUPP 0.9996 0.0013 0.9994 0.9998 Eukaryote GH
dbCAN_2 0.9987 0.0034 0.9976 0.9998 All GT
dbCAN_2:HMMER 0.9985 0.0038 0.9973 0.9996 All GT
dbCAN_2:DIAMOND 0.9990 0.0021 0.9983 0.9997 All GT
dbCAN_2:Hotpep 0.9972 0.0096 0.9940 1.0003 All GT
dbCAN_3 0.9991 0.0021 0.9984 0.9998 All GT
dbCAN_3:HMMER 0.9985 0.0038 0.9973 0.9997 All GT
dbCAN_3:DIAMOND 0.9994 0.0011 0.9990 0.9997 All GT
dbCAN_3:eCAMI 0.9986 0.0038 0.9973 0.9998 All GT
dbCAN_4 0.9991 0.0022 0.9984 0.9998 All GT
dbCAN_4:HMMER 0.9985 0.0038 0.9974 0.9997 All GT
dbCAN_4:DIAMOND 0.9993 0.0013 0.9989 0.9997 All GT
dbCAN_4:dbCAN-sub 0.9991 0.0022 0.9984 0.9998 All GT
CUPP 0.9988 0.0031 0.9978 0.9998 All GT
dbCAN_2 0.9990 0.0019 0.9985 0.9995 Bacteria GT
dbCAN_2:HMMER 0.9989 0.0023 0.9983 0.9995 Bacteria GT
dbCAN_2:DIAMOND 0.9993 0.0015 0.9989 0.9997 Bacteria GT
dbCAN_2:Hotpep 0.9982 0.0033 0.9973 0.9991 Bacteria GT
dbCAN_3 0.9994 0.0012 0.9991 0.9997 Bacteria GT
dbCAN_3:HMMER 0.9989 0.0023 0.9983 0.9995 Bacteria GT
dbCAN_3:DIAMOND 0.9996 0.0007 0.9994 0.9998 Bacteria GT
dbCAN_3:eCAMI 0.9987 0.0019 0.9981 0.9992 Bacteria GT
dbCAN_4 0.9994 0.0015 0.9990 0.9998 Bacteria GT
dbCAN_4:HMMER 0.9989 0.0023 0.9983 0.9995 Bacteria GT
dbCAN_4:DIAMOND 0.9997 0.0007 0.9995 0.9998 Bacteria GT
dbCAN_4:dbCAN-sub 0.9994 0.0015 0.9990 0.9998 Bacteria GT
CUPP 0.9987 0.0023 0.9981 0.9993 Bacteria GT
dbCAN_2 0.9993 0.0017 0.9989 0.9997 Eukaryote GT
dbCAN_2:HMMER 0.9992 0.0021 0.9987 0.9997 Eukaryote GT
dbCAN_2:DIAMOND 0.9995 0.0011 0.9992 0.9997 Eukaryote GT
dbCAN_2:Hotpep 0.9986 0.0045 0.9976 0.9997 Eukaryote GT
dbCAN_3 0.9996 0.0011 0.9993 0.9998 Eukaryote GT
dbCAN_3:HMMER 0.9992 0.0021 0.9987 0.9997 Eukaryote GT
dbCAN_3:DIAMOND 0.9997 0.0006 0.9996 0.9999 Eukaryote GT
dbCAN_3:eCAMI 0.9992 0.0018 0.9987 0.9996 Eukaryote GT
dbCAN_4 0.9996 0.0013 0.9993 0.9999 Eukaryote GT
dbCAN_4:HMMER 0.9992 0.0021 0.9987 0.9997 Eukaryote GT
dbCAN_4:DIAMOND 0.9997 0.0007 0.9995 0.9999 Eukaryote GT
dbCAN_4:dbCAN-sub 0.9996 0.0013 0.9992 0.9999 Eukaryote GT
CUPP 0.9992 0.0018 0.9988 0.9996 Eukaryote GT
dbCAN_2 0.9998 0.0004 0.9996 1.0000 All PL
dbCAN_2:HMMER 0.9998 0.0004 0.9997 1.0000 All PL
dbCAN_2:DIAMOND 0.9999 0.0004 0.9997 1.0000 All PL
dbCAN_2:Hotpep 0.9997 0.0004 0.9996 0.9999 All PL
dbCAN_3 0.9999 0.0001 0.9999 1.0000 All PL
dbCAN_3:HMMER 0.9999 0.0001 0.9999 1.0000 All PL
dbCAN_3:DIAMOND 0.9999 0.0001 0.9999 1.0000 All PL
dbCAN_3:eCAMI 0.9996 0.0005 0.9994 0.9998 All PL
dbCAN_4 0.9999 0.0001 0.9999 1.0000 All PL
dbCAN_4:HMMER 0.9999 0.0001 0.9999 1.0000 All PL
dbCAN_4:DIAMOND 0.9999 0.0001 0.9999 1.0000 All PL
dbCAN_4:dbCAN-sub 0.9999 0.0001 0.9999 1.0000 All PL
CUPP 0.9998 0.0004 0.9996 0.9999 All PL
dbCAN_2 0.9998 0.0003 0.9995 1.0000 Bacteria PL
dbCAN_2:HMMER 0.9998 0.0003 0.9996 1.0001 Bacteria PL
dbCAN_2:DIAMOND 0.9997 0.0003 0.9995 1.0000 Bacteria PL
dbCAN_2:Hotpep 0.9997 0.0003 0.9995 1.0000 Bacteria PL
dbCAN_3 0.9999 0.0001 0.9998 1.0000 Bacteria PL
dbCAN_3:HMMER 0.9999 0.0001 0.9998 1.0000 Bacteria PL
dbCAN_3:DIAMOND 0.9999 0.0002 0.9997 1.0000 Bacteria PL
dbCAN_3:eCAMI 0.9997 0.0003 0.9994 0.9999 Bacteria PL
dbCAN_4 0.9999 0.0001 0.9998 1.0000 Bacteria PL
dbCAN_4:HMMER 0.9999 0.0001 0.9998 1.0000 Bacteria PL
dbCAN_4:DIAMOND 0.9999 0.0002 0.9997 1.0000 Bacteria PL
dbCAN_4:dbCAN-sub 0.9999 0.0001 0.9998 1.0000 Bacteria PL
CUPP 0.9998 0.0003 0.9996 1.0000 Bacteria PL
dbCAN_2 0.9999 0.0003 0.9998 1.0000 Eukaryote PL
dbCAN_2:HMMER 0.9999 0.0002 0.9998 1.0000 Eukaryote PL
dbCAN_2:DIAMOND 0.9999 0.0002 0.9998 1.0000 Eukaryote PL
dbCAN_2:Hotpep 0.9998 0.0003 0.9997 0.9999 Eukaryote PL
dbCAN_3 1.0000 0.0001 0.9999 1.0000 Eukaryote PL
dbCAN_3:HMMER 1.0000 0.0001 0.9999 1.0000 Eukaryote PL
dbCAN_3:DIAMOND 0.9999 0.0001 0.9999 1.0000 Eukaryote PL
dbCAN_3:eCAMI 0.9998 0.0003 0.9996 0.9999 Eukaryote PL
dbCAN_4 1.0000 0.0001 0.9999 1.0000 Eukaryote PL
dbCAN_4:HMMER 1.0000 0.0001 0.9999 1.0000 Eukaryote PL
dbCAN_4:DIAMOND 1.0000 0.0001 0.9999 1.0000 Eukaryote PL
dbCAN_4:dbCAN-sub 1.0000 0.0001 0.9999 1.0000 Eukaryote PL
CUPP 0.9999 0.0003 0.9997 1.0000 Eukaryote PL
dbCAN_2 0.9985 0.0026 0.9970 1.0000 All CE
dbCAN_2:HMMER 0.9985 0.0022 0.9974 0.9997 All CE
dbCAN_2:DIAMOND 0.9984 0.0025 0.9970 0.9998 All CE
dbCAN_2:Hotpep 0.9981 0.0027 0.9966 0.9996 All CE
dbCAN_3 0.9985 0.0027 0.9970 1.0000 All CE
dbCAN_3:HMMER 0.9985 0.0023 0.9972 0.9998 All CE
dbCAN_3:DIAMOND 0.9989 0.0018 0.9979 0.9999 All CE
dbCAN_3:eCAMI 0.9981 0.0027 0.9966 0.9996 All CE
dbCAN_4 0.9989 0.0021 0.9977 1.0001 All CE
dbCAN_4:HMMER 0.9987 0.0023 0.9974 1.0000 All CE
dbCAN_4:DIAMOND 0.9992 0.0017 0.9982 1.0001 All CE
dbCAN_4:dbCAN-sub 0.9989 0.0021 0.9977 1.0001 All CE
CUPP 0.9986 0.0021 0.9974 0.9998 All CE
dbCAN_2 0.9995 0.0006 0.9991 0.9998 Bacteria CE
dbCAN_2:HMMER 0.9994 0.0008 0.9990 0.9999 Bacteria CE
dbCAN_2:DIAMOND 0.9993 0.0007 0.9988 0.9997 Bacteria CE
dbCAN_2:Hotpep 0.9990 0.0009 0.9984 0.9995 Bacteria CE
dbCAN_3 0.9995 0.0007 0.9991 0.9999 Bacteria CE
dbCAN_3:HMMER 0.9994 0.0008 0.9990 0.9999 Bacteria CE
dbCAN_3:DIAMOND 0.9993 0.0009 0.9987 0.9998 Bacteria CE
dbCAN_3:eCAMI 0.9990 0.0010 0.9983 0.9996 Bacteria CE
dbCAN_4 0.9995 0.0008 0.9991 1.0000 Bacteria CE
dbCAN_4:HMMER 0.9995 0.0008 0.9990 0.9999 Bacteria CE
dbCAN_4:DIAMOND 0.9995 0.0008 0.9990 1.0000 Bacteria CE
dbCAN_4:dbCAN-sub 0.9995 0.0008 0.9991 1.0000 Bacteria CE
CUPP 0.9994 0.0009 0.9989 1.0000 Bacteria CE
dbCAN_2 0.9992 0.0014 0.9984 0.9999 Eukaryote CE
dbCAN_2:HMMER 0.9991 0.0013 0.9985 0.9998 Eukaryote CE
dbCAN_2:DIAMOND 0.9990 0.0014 0.9984 0.9997 Eukaryote CE
dbCAN_2:Hotpep 0.9987 0.0015 0.9980 0.9995 Eukaryote CE
dbCAN_3 0.9991 0.0016 0.9983 0.9999 Eukaryote CE
dbCAN_3:HMMER 0.9991 0.0013 0.9985 0.9998 Eukaryote CE
dbCAN_3:DIAMOND 0.9992 0.0010 0.9987 0.9998 Eukaryote CE
dbCAN_3:eCAMI 0.9987 0.0015 0.9980 0.9995 Eukaryote CE
dbCAN_4 0.9993 0.0012 0.9988 0.9999 Eukaryote CE
dbCAN_4:HMMER 0.9992 0.0013 0.9986 0.9999 Eukaryote CE
dbCAN_4:DIAMOND 0.9994 0.0009 0.9990 0.9999 Eukaryote CE
dbCAN_4:dbCAN-sub 0.9993 0.0012 0.9988 0.9999 Eukaryote CE
CUPP 0.9992 0.0012 0.9985 0.9998 Eukaryote CE
dbCAN_2 1.0000 NA NaN NaN All AA
dbCAN_2:HMMER 1.0000 NA NaN NaN All AA
dbCAN_2:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_2:Hotpep 0.9998 NA NaN NaN All AA
dbCAN_3 1.0000 NA NaN NaN All AA
dbCAN_3:HMMER 1.0000 NA NaN NaN All AA
dbCAN_3:DIAMOND 1.0000 NA NaN NaN All AA
dbCAN_3:eCAMI 0.9998 NA NaN NaN All AA
dbCAN_4 1.0000 NA NaN NaN All AA
dbCAN_4:HMMER 1.0000 NA NaN NaN All AA
dbCAN_4:DIAMOND 0.9998 NA NaN NaN All AA
dbCAN_4:dbCAN-sub 1.0000 NA NaN NaN All AA
CUPP 1.0000 NA NaN NaN All AA
dbCAN_2 0.9987 0.0020 0.9976 0.9998 Bacteria AA
dbCAN_2:HMMER 0.9987 0.0022 0.9975 0.9999 Bacteria AA
dbCAN_2:DIAMOND 0.9987 0.0017 0.9977 0.9996 Bacteria AA
dbCAN_2:Hotpep 0.9985 0.0019 0.9975 0.9996 Bacteria AA
dbCAN_3 0.9991 0.0015 0.9983 0.9999 Bacteria AA
dbCAN_3:HMMER 0.9991 0.0014 0.9983 0.9999 Bacteria AA
dbCAN_3:DIAMOND 0.9990 0.0014 0.9983 0.9998 Bacteria AA
dbCAN_3:eCAMI 0.9983 0.0020 0.9972 0.9994 Bacteria AA
dbCAN_4 0.9991 0.0013 0.9984 0.9998 Bacteria AA
dbCAN_4:HMMER 0.9991 0.0014 0.9983 0.9999 Bacteria AA
dbCAN_4:DIAMOND 0.9990 0.0013 0.9983 0.9998 Bacteria AA
dbCAN_4:dbCAN-sub 0.9991 0.0014 0.9983 0.9998 Bacteria AA
CUPP 0.9986 0.0017 0.9977 0.9996 Bacteria AA
dbCAN_2 0.9994 0.0010 0.9989 0.9999 Eukaryote AA
dbCAN_2:HMMER 0.9994 0.0011 0.9988 1.0000 Eukaryote AA
dbCAN_2:DIAMOND 0.9994 0.0009 0.9989 0.9998 Eukaryote AA
dbCAN_2:Hotpep 0.9993 0.0009 0.9988 0.9998 Eukaryote AA
dbCAN_3 0.9996 0.0007 0.9992 1.0000 Eukaryote AA
dbCAN_3:HMMER 0.9996 0.0007 0.9992 0.9999 Eukaryote AA
dbCAN_3:DIAMOND 0.9995 0.0007 0.9992 0.9999 Eukaryote AA
dbCAN_3:eCAMI 0.9992 0.0010 0.9987 0.9997 Eukaryote AA
dbCAN_4 0.9996 0.0007 0.9992 0.9999 Eukaryote AA
dbCAN_4:HMMER 0.9996 0.0007 0.9992 0.9999 Eukaryote AA
dbCAN_4:DIAMOND 0.9995 0.0006 0.9992 0.9999 Eukaryote AA
dbCAN_4:dbCAN-sub 0.9996 0.0007 0.9992 0.9999 Eukaryote AA
CUPP 0.9993 0.0009 0.9989 0.9998 Eukaryote AA
dbCAN_2 0.9991 0.0022 0.9985 0.9998 All CBM
dbCAN_2:HMMER 0.9985 0.0044 0.9972 0.9998 All CBM
dbCAN_2:DIAMOND 0.9993 0.0017 0.9988 0.9998 All CBM
dbCAN_2:Hotpep 0.9958 0.0061 0.9942 0.9973 All CBM
dbCAN_3 0.9993 0.0018 0.9988 0.9998 All CBM
dbCAN_3:HMMER 0.9985 0.0044 0.9972 0.9998 All CBM
dbCAN_3:DIAMOND 0.9994 0.0016 0.9990 0.9999 All CBM
dbCAN_3:eCAMI 0.9976 0.0037 0.9967 0.9986 All CBM
dbCAN_4 0.9995 0.0011 0.9992 0.9999 All CBM
dbCAN_4:HMMER 0.9987 0.0042 0.9975 0.9999 All CBM
dbCAN_4:DIAMOND 0.9994 0.0016 0.9989 0.9999 All CBM
dbCAN_4:dbCAN-sub 0.9996 0.0008 0.9994 0.9998 All CBM
CUPP 0.9968 0.0054 0.9952 0.9984 All CBM
dbCAN_2 0.9985 0.0023 0.9975 0.9994 Bacteria CBM
dbCAN_2:HMMER 0.9973 0.0047 0.9954 0.9993 Bacteria CBM
dbCAN_2:DIAMOND 0.9987 0.0017 0.9981 0.9994 Bacteria CBM
dbCAN_2:Hotpep 0.9968 0.0050 0.9953 0.9983 Bacteria CBM
dbCAN_3 0.9989 0.0018 0.9981 0.9996 Bacteria CBM
dbCAN_3:HMMER 0.9973 0.0047 0.9954 0.9993 Bacteria CBM
dbCAN_3:DIAMOND 0.9994 0.0009 0.9990 0.9998 Bacteria CBM
dbCAN_3:eCAMI 0.9980 0.0033 0.9969 0.9992 Bacteria CBM
dbCAN_4 0.9986 0.0026 0.9976 0.9997 Bacteria CBM
dbCAN_4:HMMER 0.9974 0.0047 0.9955 0.9994 Bacteria CBM
dbCAN_4:DIAMOND 0.9996 0.0006 0.9993 0.9998 Bacteria CBM
dbCAN_4:dbCAN-sub 0.9985 0.0027 0.9974 0.9996 Bacteria CBM
CUPP 0.9961 0.0049 0.9941 0.9982 Bacteria CBM
dbCAN_2 0.9993 0.0017 0.9988 0.9997 Eukaryote CBM
dbCAN_2:HMMER 0.9988 0.0031 0.9980 0.9996 Eukaryote CBM
dbCAN_2:DIAMOND 0.9994 0.0013 0.9991 0.9998 Eukaryote CBM
dbCAN_2:Hotpep 0.9970 0.0044 0.9960 0.9981 Eukaryote CBM
dbCAN_3 0.9994 0.0013 0.9991 0.9998 Eukaryote CBM
dbCAN_3:HMMER 0.9988 0.0031 0.9980 0.9996 Eukaryote CBM
dbCAN_3:DIAMOND 0.9996 0.0010 0.9994 0.9999 Eukaryote CBM
dbCAN_3:eCAMI 0.9984 0.0023 0.9979 0.9990 Eukaryote CBM
dbCAN_4 0.9995 0.0012 0.9992 0.9998 Eukaryote CBM
dbCAN_4:HMMER 0.9989 0.0030 0.9981 0.9997 Eukaryote CBM
dbCAN_4:DIAMOND 0.9997 0.0009 0.9994 0.9999 Eukaryote CBM
dbCAN_4:dbCAN-sub 0.9995 0.0011 0.9992 0.9998 Eukaryote CBM
CUPP 0.9978 0.0037 0.9968 0.9988 Eukaryote CBM
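
The rows above appear to give, per tool, a mean score, a standard deviation, and the lower and upper bounds of a confidence interval. This column reading is inferred from the values, for example 0.9985, 0.0026, 0.9970, 1.0000. The sketch below shows one way such a summary could be produced, and why a symmetric interval can report an upper bound slightly above 1 (such as 1.0001) or NaN bounds when no standard deviation is available. It assumes a normal-approximation interval, mean ± z·sd/√n. The function name `summarise` and the choice z = 1.96 are illustrative assumptions, not the method used in this evaluation.

```python
import math
import statistics


def summarise(scores, z=1.96):
    """Return (mean, sd, lower, upper) for a list of scores.

    Assumption: a symmetric normal-approximation interval,
    mean +/- z * sd / sqrt(n). The source does not state its method.
    """
    n = len(scores)
    mean = statistics.fmean(scores)
    if n < 2:
        # The sample standard deviation is undefined for a single value,
        # so the interval bounds cannot be computed either.
        return mean, math.nan, math.nan, math.nan
    sd = statistics.stdev(scores)
    half_width = z * sd / math.sqrt(n)
    # The interval is not clipped to [0, 1], so the upper bound
    # may exceed 1 when the mean is already close to 1.
    return mean, sd, mean - half_width, mean + half_width


# Hypothetical scores, for illustration only
print(summarise([1.0, 1.0, 0.99]))
print(summarise([1.0]))
```

Under this assumption, an upper bound above 1 is an artefact of the symmetric interval rather than a score that exceeds the metric's range. Clipping the bounds to [0, 1] would be one way to avoid reporting it.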

12 CAZy family multilabel classification performance per taxonomic group

Table 12.1: Rand Index of CAZyme classifier classification of CAZy family annotations
Prediction_tool Lower_CI Mean Upper_CI Standard_Deviation Tax_group
dbCAN_2 0.9996 0.9996 0.9997 0.0013 All
dbCAN_2:HMMER 0.9995 0.9995 0.9996 0.0014 All
dbCAN_2:DIAMOND 0.9997 0.9997 0.9997 0.0012 All
dbCAN_2:Hotpep 0.9990 0.9991 0.9991 0.0023 All
dbCAN_3 0.9997 0.9998 0.9998 0.0011 All
dbCAN_3:HMMER 0.9995 0.9996 0.9996 0.0014 All
dbCAN_3:DIAMOND 0.9998 0.9998 0.9998 0.0010 All
dbCAN_3:eCAMI 0.9994 0.9994 0.9994 0.0017 All
dbCAN_4 0.9997 0.9998 0.9998 0.0011 All
dbCAN_4:HMMER 0.9996 0.9996 0.9996 0.0014 All
dbCAN_4:DIAMOND 0.9998 0.9998 0.9998 0.0009 All
dbCAN_4:dbCAN-sub 0.9997 0.9998 0.9998 0.0011 All
CUPP 0.9994 0.9994 0.9995 0.0015 All
dbCAN_2 0.9996 0.9996 0.9997 0.0013 Bacteria
dbCAN_2:HMMER 0.9995 0.9996 0.9996 0.0014 Bacteria
dbCAN_2:DIAMOND 0.9997 0.9997 0.9997 0.0012 Bacteria
dbCAN_2:Hotpep 0.9989 0.9990 0.9990 0.0026 Bacteria
dbCAN_3 0.9997 0.9997 0.9998 0.0011 Bacteria
dbCAN_3:HMMER 0.9995 0.9996 0.9996 0.0014 Bacteria
dbCAN_3:DIAMOND 0.9998 0.9998 0.9998 0.0010 Bacteria
dbCAN_3:eCAMI 0.9993 0.9994 0.9994 0.0019 Bacteria
dbCAN_4 0.9998 0.9998 0.9998 0.0011 Bacteria
dbCAN_4:HMMER 0.9996 0.9996 0.9996 0.0013 Bacteria
dbCAN_4:DIAMOND 0.9998 0.9998 0.9998 0.0010 Bacteria
dbCAN_4:dbCAN-sub 0.9998 0.9998 0.9998 0.0011 Bacteria
CUPP 0.9994 0.9994 0.9995 0.0016 Bacteria
dbCAN_2 0.9996 0.9996 0.9997 0.0013 Eukaryote
dbCAN_2:HMMER 0.9995 0.9995 0.9996 0.0014 Eukaryote
dbCAN_2:DIAMOND 0.9997 0.9997 0.9997 0.0011 Eukaryote
dbCAN_2:Hotpep 0.9992 0.9992 0.9993 0.0019 Eukaryote
dbCAN_3 0.9997 0.9998 0.9998 0.0010 Eukaryote
dbCAN_3:HMMER 0.9995 0.9996 0.9996 0.0014 Eukaryote
dbCAN_3:DIAMOND 0.9998 0.9998 0.9999 0.0009 Eukaryote
dbCAN_3:eCAMI 0.9994 0.9994 0.9995 0.0015 Eukaryote
dbCAN_4 0.9997 0.9997 0.9998 0.0011 Eukaryote
dbCAN_4:HMMER 0.9995 0.9996 0.9996 0.0014 Eukaryote
dbCAN_4:DIAMOND 0.9998 0.9999 0.9999 0.0008 Eukaryote
dbCAN_4:dbCAN-sub 0.9997 0.9997 0.9997 0.0011 Eukaryote
CUPP 0.9994 0.9995 0.9995 0.0014 Eukaryote
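
To make the Rand Index values in Table 12.1 easier to interpret, the following is a minimal sketch of how a Rand Index can be computed for one protein in a multilabel setting. It assumes the index is the fraction of candidate CAZy families on which the predicted and the known annotations agree (both present or both absent). The function name `rand_index` and the small family list are hypothetical, and the exact formulation used in this evaluation may differ.

```python
def rand_index(true_fams, pred_fams, all_fams):
    """Fraction of families on which prediction and ground truth agree.

    Agreement means the family is either in both sets (true positive)
    or in neither set (true negative).
    """
    true_fams, pred_fams = set(true_fams), set(pred_fams)
    agreements = sum(
        (fam in true_fams) == (fam in pred_fams) for fam in all_fams
    )
    return agreements / len(all_fams)


# Illustrative universe of five families; the real one is far larger
all_fams = ["GH1", "GH2", "GH3", "CBM1", "PL1"]

# The tool finds GH1 but misses the CBM1 module
print(rand_index({"GH1", "CBM1"}, {"GH1"}, all_fams))  # → 0.8
```

Under this definition every family that is correctly not predicted counts as an agreement. With several hundred CAZy families and only a few annotated per protein, true negatives dominate, so all tools score close to 1 and differences show up only in the third or fourth decimal place.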