run_pbjelly - a faster pbjelly
run_pbjelly is a toolkit for closing assembly gaps with PacBio or ONT long reads. It is essentially the same as the published PBJelly. The main difference is that run_pbjelly first uses MECAT or Minimap2 to filter out reads that have little to do with the gap regions, so that only the remaining reads go through the Blasr alignment that PBJelly relies on. MECAT and Minimap2 have been reported to be more than 10 times faster than Blasr.
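The filtering idea can be sketched in a few lines. This is a minimal illustration, not the actual run_pbjelly code: it assumes the fast aligner's output is available as PAF records (the format Minimap2 writes by default), and the `Gap` class, the `reads_near_gaps` function, and the 1,000 bp `flank` value are hypothetical names and parameters chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gap:
    scaffold: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def parse_paf_line(line):
    """Return (read_name, target_name, target_start, target_end) from one PAF record."""
    fields = line.rstrip("\n").split("\t")
    # PAF columns (1-based): 1 = query name, 6 = target name,
    # 8 = target start, 9 = target end.
    return fields[0], fields[5], int(fields[7]), int(fields[8])


def reads_near_gaps(paf_lines, gaps, flank=1000):
    """Collect names of reads whose alignment overlaps a gap widened by `flank` bp."""
    gaps_by_scaffold = {}
    for gap in gaps:
        gaps_by_scaffold.setdefault(gap.scaffold, []).append(gap)

    keep = set()
    for line in paf_lines:
        if not line.strip():
            continue
        read, target, t_start, t_end = parse_paf_line(line)
        for gap in gaps_by_scaffold.get(target, []):
            # Half-open interval overlap test against the widened gap.
            if t_start < gap.end + flank and t_end > gap.start - flank:
                keep.add(read)
                break
    return keep


# Toy input: read1 spans the gap, read2 maps far away from it.
paf = [
    "read1\t9000\t0\t9000\t+\tscaf1\t100000\t4500\t13500\t8000\t9000\t60",
    "read2\t8000\t0\t8000\t+\tscaf1\t100000\t50000\t58000\t7500\t8000\t60",
]
gaps = [Gap("scaf1", 10000, 10500)]
kept = reads_near_gaps(paf, gaps)
print(sorted(kept))
```

Only the reads in `kept` would then be passed on to the slower Blasr alignment, which is where the time saving comes from.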
Another advantage of our pipeline is that all the tasks run as a single job. With the original PBJelly pipeline, you have to submit each task manually and check that it has completed before starting the next one. We also replaced blasr1.3 with blasr5.3 in PBJelly.
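As a rough sketch of what "a single job" means here, the snippet below runs a list of stages in order and stops at the first one that fails, which replaces the manual submit-and-check loop. The stage labels and the `run_pipeline` and `fake_runner` helpers are hypothetical and only illustrate the control flow; they are not the commands run_pbjelly actually executes.

```python
def run_pipeline(stages, run_stage):
    """Run stages in order and stop at the first failure.

    `run_stage` is any callable that takes a stage name and returns True on success.
    Returns (completed_stage_names, failed_stage_name_or_None).
    """
    completed = []
    for name in stages:
        if not run_stage(name):
            return completed, name
        completed.append(name)
    return completed, None


# Hypothetical stage labels, loosely following the description above.
STAGES = ["filter_reads", "blasr_alignment", "gap_filling"]


def fake_runner(name):
    """Stand-in for real work: pretend the last stage fails."""
    return name != "gap_filling"


done, failed = run_pipeline(STAGES, fake_runner)
print("completed:", done, "failed:", failed)
```

In a real wrapper, `run_stage` would launch the corresponding command and check its exit status or expected output files.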
For runjelly, the version that uses Minimap2 as the replacement, a Chinese-language document is available.
The table below compares run_pbjelly with the original PBJelly on a ~600 Mb genome. run_pbjelly achieves a similar or better contig N50 with far less alignment time: about 8 times fewer CPU hours (1,112 vs. 46 + 92 = 138), because in this case ~93% of the reads were filtered out by MECAT before the Blasr alignment.
| Tool | Contig N50 (kbp) | CPU hours |
|---|---|---|
| original scaffolds | 2,882 | NA |
| PBJelly | 3,363 | 1,112 (blasr alignment) |
| mecat | 2,912 | 46 |
| run_pbjelly | 3,494 | 46 (mecat alignment) + 92 (blasr alignment) |
| runjelly(zhouyiqi) | 3,467 | 3 (minimap2 alignment) + 150 (blasr alignment) |
A second test used a ~731 Mb genome with 100x PacBio data, on the same 20 CPUs:
| Tool | Contig N50 (kbp) | CPU hours |
|---|---|---|
| original scaffolds | 598 | NA |
| PBJelly | 796 | 5,367 (blasr alignment) |
| run_pbjelly | 794 | 92 (mecat alignment) + 858 (blasr alignment) |
| runpbjelly(zhouyiqi) | 801 | 29 (mecat alignment) + 452 (blasr alignment) |
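The speedups implied by the two tables can be checked with a little arithmetic. The sketch below sums the CPU hours of both alignment stages for each pipeline and divides PBJelly's Blasr CPU hours by that total. The numbers are copied from the tables; the `alignment_speedup` helper and the dictionary layout are just for illustration.

```python
def alignment_speedup(baseline_cpu_hours, stage_cpu_hours):
    """Ratio of baseline CPU hours to the summed CPU hours of all alignment stages."""
    return baseline_cpu_hours / sum(stage_cpu_hours)


# CPU hours copied from the two tables above.
benchmarks = {
    "~600M genome": {
        "baseline": 1112,  # PBJelly, blasr alignment
        "pipelines": {"run_pbjelly": [46, 92], "runjelly(zhouyiqi)": [3, 150]},
    },
    "~731M genome": {
        "baseline": 5367,  # PBJelly, blasr alignment
        "pipelines": {"run_pbjelly": [92, 858], "runpbjelly(zhouyiqi)": [29, 452]},
    },
}

speedups = {
    (genome, tool): round(alignment_speedup(data["baseline"], hours), 1)
    for genome, data in benchmarks.items()
    for tool, hours in data["pipelines"].items()
}

for (genome, tool), ratio in speedups.items():
    print(f"{genome}: {tool} uses {ratio}x fewer CPU hours than PBJelly")
```

With these figures the ratios come out at 8.1 and 7.3 for the first genome and 5.6 and 11.2 for the second, so the measured reduction in alignment CPU hours ranges from roughly 6 to 11 times depending on the dataset and pipeline.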
Getting Started
Examples
Release Notes
Contact
Reference