job separation by chromosome #63

Merged

hyunhwan-bcm
Contributor

@hyunhwan-bcm hyunhwan-bcm commented Aug 19, 2024

Tested by comparing the output against nextflow_conversion: after sorting both outputs, the results are identical to within a tolerance of 1e-10. The test sample had 724,816 variants, and the run took 5 minutes.
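For readers who want to reproduce this kind of check, here is a minimal sketch of a sort-then-compare test with an absolute tolerance of 1e-10. It is not the script used for this PR. The function names, the choice of `varId_dash` as the sort key, and the tiny inline tables are assumptions made for illustration; it also assumes the key uniquely identifies a row and that both tables share the same columns.

```python
import csv
import io
import math


def load_sorted(text, key):
    """Parse CSV text and return its rows sorted by the key column."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return sorted(rows, key=lambda r: r[key])


def cells_match(a, b, tol=1e-10):
    """Compare within tol when both cells parse as floats, else exactly."""
    try:
        return math.isclose(float(a), float(b), rel_tol=0.0, abs_tol=tol)
    except ValueError:
        # Non-numeric cells such as "NA" or gene symbols: exact match only.
        return a == b


def diff_tables(old_text, new_text, key="varId_dash", tol=1e-10):
    """Return (key value, column, old, new) for every cell that differs."""
    old_rows = load_sorted(old_text, key)
    new_rows = load_sorted(new_text, key)
    if len(old_rows) != len(new_rows):
        raise ValueError("row counts differ")
    diffs = []
    for o, n in zip(old_rows, new_rows):
        for col in o:
            if not cells_match(o[col], n[col], tol):
                diffs.append((o[key], col, o[col], n[col]))
    return diffs


# Hypothetical two-row tables; row order differs on purpose.
old = "varId_dash,phrank,zyg\n1-200-G-GT,0.0,1.0\n1-100-A-T,4.386657289640127,1.0\n"
new = "varId_dash,phrank,zyg\n1-100-A-T,4.386657289640128,2.0\n1-200-G-GT,0.0,1.0\n"

for entry in diff_tables(old, new):
    print(entry)
```

An absolute tolerance is used here because several score columns sit at or near zero, where a relative tolerance would be too strict. An empty result from `diff_tables` corresponds to "identical results" in the sense described above.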

Report

nextflow_conversion - https://gistcdn.githack.com/hyunhwan-bcm/fab1c7755bce8806f06d158332c86286/raw/index.html
the current PR - https://gistcdn.githack.com/hyunhwan-bcm/c62d2ae1189f84f311962a9be3a5d10e/raw/index.html

Timeline

the current PR - https://gistcdn.githack.com/hyunhwan-bcm/c94b9027c045277b59fd5aa5c4dde7c9/raw/index.html

@hyunhwan-bcm
Contributor Author

hyunhwan-bcm commented Aug 19, 2024

It got fixed.

While investigating, I found some differences from nextflow_conversion:

@@,Unnamed: 0,diffuse_Phrank_STRING,hgmdSymptomScore,...,hgmdSymptomSimScore,GERPpp_RS,gnomadAF,gnomadAFg,LRT_score,LRT_Omega,phyloP100way_vertebrate,gnomadGeneZscore,...,IMPACT,CADD_phred,CADD_PHRED,DANN_score,REVEL_score,fathmm_MKL_coding_score,conservationScoreGnomad,conservationScoreOELof,Polyphen2_HDIV_score,Polyphen2_HVAR_score,SIFT_score,zyg,FATHMM_score,M_CAP_score,MutationAssessor_score,ESP6500_AA_AF,...,CLASS,phrank,isB/LB,...,predict,min_ranking,max_ranking,...,identifier,origId,varId_dash,geneSymbol,geneEnsId,rsId,HGVSc,HGVSp,...,confidence (nd),ranking (nd),confidence level (nd),...,confidence (recessive),ranking (recessive),recessive var2,...,confidence level (nd recessive),ranking (nd recessive),nd recessive var2
→,1-100573634-A-T,0.8484286691385579→0.8985805389837482,0.0,...,0.386988942334944,-11.1→-10.9,0.0,0.0,0.0,0.0,-14.068→-4.393,1.0075,...,1.0,0.001,4.639,0.054416449317324→0.1128215789224949,0.002→0.007,6e-05→0.00064,2.0,1.0,0.0,0.0,1.0,1.0,9.66→7.75,0.000339,-3.395→-2.645,0.0,...,0.0,4.386657289640127→4.386657289640128,0,...,0.0,42→116,7562,...,1,1_100573634_A_T,1-100573634-A-T,SASS6,ENSG00000156876,-,ENST00000462159.1:n.1003-74T>A,-,...,0.0,566→577,Unsolved,...,-1.0,99999,NA,...,Unsolved,99999,NA
→,1-100573634-A-T,0.8484286691385579→0.8985805389837482,0.0,...,0.386988942334944,-11.1→-10.9,0.0,0.0,0.0,0.0,-14.068→-4.393,1.0075,...,1.0,0.001,4.639,0.054416449317324→0.1128215789224949,0.002→0.007,6e-05→0.00064,2.0,1.0,0.0,0.0,1.0,1.0,9.66→7.75,0.000339,-3.395→-2.645,0.0,...,0.0,4.386657289640127→4.386657289640128,0,...,0.0,42→116,7562,...,1,1_100573634_A_T,1-100573634-A-T,SASS6,ENSG00000156876,-,ENST00000535161.1:c.361-74T>A,-,...,0.0,566→577,Unsolved,...,-1.0,99999,NA,...,Unsolved,99999,NA
→,1-100661988-GAAAAAAA-GAA,0.8844450844162857→0.9380785846533636,0.0,...,0.386988942334944,2.0418650286041187→2.1999319018404906,0.0,0.0,0.1248318352234823→0.1847117112676056,1.6412083288859245→0.533333323943662,2.208126482213439→2.623349693251533,0.9906,...,1.0,16.63697120271033→15.767208588957056,7.524767655746509→7.219155397390272,0.8366118269851937→0.8214278370350678,0.2322019964768056→0.2349480519480519,0.509853043478261→0.4692662576687116,2.0,1.0,0.459→0.4185,0.173→0.188,0.096→0.079,1.0,0.79→0.74,0.061870585975024→0.0311653737373737,1.445→1.455,0.0,...,0.0,5.123814380349511,0,...,0.0,42→116,7562,...,1,1_100661987_GAAAAAAA_GAA,1-100661988-GAAAAAAA-GAA,DBT,ENSG00000137992,rs752915898,ENST00000370132.4:c.1282-14_1282-10del,-,...,0.0,7562,Unsolved,...,0.0,10553→10554,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T,...,Unsolved,10553→10554,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T
→,1-100661988-GAAAAAAA-GAA,0.8844450844162857→0.9380785846533636,0.0,...,0.386988942334944,2.0418650286041187→2.1999319018404906,0.0,0.0,0.1248318352234823→0.1847117112676056,1.6412083288859245→0.533333323943662,2.208126482213439→2.623349693251533,0.9906,...,1.0,16.63697120271033→15.767208588957056,7.524767655746509→7.219155397390272,0.8366118269851937→0.8214278370350678,0.2322019964768056→0.2349480519480519,0.509853043478261→0.4692662576687116,2.0,1.0,0.459→0.4185,0.173→0.188,0.096→0.079,1.0,0.79→0.74,0.061870585975024→0.0311653737373737,1.445→1.455,0.0,...,0.0,5.123814380349511,0,...,0.0,42→116,7562,...,1,1_100661987_GAAAAAAA_GAA,1-100661988-GAAAAAAA-GAA,DBT,ENSG00000137992,rs752915898,ENST00000370132.4:c.1282-16_1282-10del,-,...,0.0,7562,Unsolved,...,0.0,10553→10554,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T,...,Unsolved,10553→10554,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T
→,1-100661988-GAAAAAAA-GAA,0.8844450844162857→0.9380785846533636,0.0,...,0.386988942334944,2.0418650286041187→2.1999319018404906,0.0,0.0,0.1248318352234823→0.1847117112676056,1.6412083288859245→0.533333323943662,2.208126482213439→2.623349693251533,0.9906,...,1.0,16.63697120271033→15.767208588957056,7.524767655746509→7.219155397390272,0.8366118269851937→0.8214278370350678,0.2322019964768056→0.2349480519480519,0.509853043478261→0.4692662576687116,2.0,1.0,0.459→0.4185,0.173→0.188,0.096→0.079,1.0,0.79→0.74,0.061870585975024→0.0311653737373737,1.445→1.455,0.0,...,0.0,5.123814380349511,0,...,0.0,42→116,7562,...,1,1_100661987_GAAAAAAA_GAA,1-100661988-GAAAAAAA-GAA,RP11-305E17.7,ENSG00000271415,rs752915898,-,-,...,0.0,7562,Unsolved,...,0.0,10553→10554,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T,...,Unsolved,10553→10554,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T
→,1-100661988-GAAAAAAA-GAA,0.8844450844162857→0.9380785846533636,0.0,...,0.386988942334944,2.0418650286041187→2.1999319018404906,0.0,0.0,0.1248318352234823→0.1847117112676056,1.6412083288859245→0.533333323943662,2.208126482213439→2.623349693251533,0.9906,...,1.0,16.63697120271033→15.767208588957056,7.524767655746509→7.219155397390272,0.8366118269851937→0.8214278370350678,0.2322019964768056→0.2349480519480519,0.509853043478261→0.4692662576687116,2.0,1.0,0.459→0.4185,0.173→0.188,0.096→0.079,1.0,0.79→0.74,0.061870585975024→0.0311653737373737,1.445→1.455,0.0,...,0.0,5.123814380349511,0,...,0.0,42→116,7562,...,1,1_100661987_GAAAAAAA_GAA,1-100661988-GAAAAAAA-GAA,RP11-305E17.7,ENSG00000271415,rs752915898,-,-,...,0.0,7562,Unsolved,...,0.0,10553→10554,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T,...,Unsolved,10553→10554,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T
→,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T,0.8844450844162857→0.9380785846533636,0.0,...,0.386988942334944,2.0418650286041187→2.1999319018404906,0.0,0.0,0.1248318352234823→0.1847117112676056,1.6412083288859245→0.533333323943662,2.208126482213439→2.623349693251533,0.9906,...,1.0,16.63697120271033→15.767208588957056,7.524767655746509→7.219155397390272,0.8366118269851937→0.8214278370350678,0.2322019964768056→0.2349480519480519,0.509853043478261→0.4692662576687116,2.0,1.0,0.459→0.4185,0.173→0.188,0.096→0.079,2.0,0.79→0.74,0.061870585975024→0.0311653737373737,1.445→1.455,0.0,...,0.0,5.123814380349511,0,...,0.0,42→116,7562,...,1,1_100675881_TAAGAAGAAGAAGAAGAAGAAG_T,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T,BRI3P1,ENSG00000225169,rs145600331,-,-,...,0.0,7562,Unsolved,...,0.0,10553→10554,1-100661988-GAAAAAAA-GAA,...,Unsolved,10553→10554,1-100661988-GAAAAAAA-GAA
→,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T,0.8844450844162857→0.9380785846533636,0.0,...,0.386988942334944,2.0418650286041187→2.1999319018404906,0.0,0.0,0.1248318352234823→0.1847117112676056,1.6412083288859245→0.533333323943662,2.208126482213439→2.623349693251533,0.9906,...,1.0,16.63697120271033→15.767208588957056,7.524767655746509→7.219155397390272,0.8366118269851937→0.8214278370350678,0.2322019964768056→0.2349480519480519,0.509853043478261→0.4692662576687116,2.0,1.0,0.459→0.4185,0.173→0.188,0.096→0.079,2.0,0.79→0.74,0.061870585975024→0.0311653737373737,1.445→1.455,0.0,...,0.0,5.123814380349511,0,...,0.0,42→116,7562,...,1,1_100675881_TAAGAAGAAGAAGAAGAAGAAG_T,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T,DBT,ENSG00000137992,rs145600331,-,-,...,0.0,7562,Unsolved,...,0.0,10553→10554,1-100661988-GAAAAAAA-GAA,...,Unsolved,10553→10554,1-100661988-GAAAAAAA-GAA
→,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T,0.8844450844162857→0.9380785846533636,0.0,...,0.386988942334944,2.0418650286041187→2.1999319018404906,0.0,0.0,0.1248318352234823→0.1847117112676056,1.6412083288859245→0.533333323943662,2.208126482213439→2.623349693251533,0.9906,...,1.0,16.63697120271033→15.767208588957056,7.524767655746509→7.219155397390272,0.8366118269851937→0.8214278370350678,0.2322019964768056→0.2349480519480519,0.509853043478261→0.4692662576687116,2.0,1.0,0.459→0.4185,0.173→0.188,0.096→0.079,2.0,0.79→0.74,0.061870585975024→0.0311653737373737,1.445→1.455,0.0,...,0.0,5.123814380349511,0,...,0.0,42→116,7562,...,1,1_100675881_TAAGAAGAAGAAGAAGAAGAAG_T,1-100675882-TAAGAAGAAGAAGAAGAAGAAG-T,DBT,ENSG00000137992,rs145600331,ENST00000370132.4:c.1017+348_1017+368del,-,...,0.0,7562,Unsolved,...,0.0,10553→10554,1-100661988-GAAAAAAA-GAA,...,Unsolved,10553→10554,1-100661988-GAAAAAAA-GAA
→,1-100733207-G-GT,0.2978688937686742→0.3069327298909689,0.0,...,0.318182103000133,2.0418650286041187→2.1999319018404906,0.0680509,0.0680509,0.1248318352234823→0.1847117112676056,1.6412083288859245→0.533333323943662,2.208126482213439→2.623349693251533,0.30262,...,1.0,16.63697120271033→15.767208588957056,7.524767655746509→7.219155397390272,0.8366118269851937→0.8214278370350678,0.2322019964768056→0.2349480519480519,0.509853043478261→0.4692662576687116,1.0,1.0,0.459→0.4185,0.173→0.188,0.096→0.079,1.0,0.79→0.74,0.061870585975024→0.0311653737373737,1.445→1.455,0.0,...,0.0,0.0,0,...,0.0,42→116,7562,...,1,1_100733207_G_GT,1-100733207-G-GT,RP11-305E17.6,ENSG00000224616,rs60634058,-,-,...,0.0,7562,Unsolved,...,-1.0,99999,NA,...,Unsolved,99999,NA
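Each changed cell in the rows above is written as `old→new`. The following sketch, which is an illustration rather than part of the PR, pulls those pairs out of one row and separates changes that exceed the 1e-10 tolerance from floating-point noise. The function names are hypothetical, and the sample header and row reuse a few cells from the first diff row.

```python
import math

ARROW = "→"


def changed_columns(header, row):
    """Map column name -> (old, new) for cells written as 'old→new'."""
    changes = {}
    for name, cell in zip(header, row):
        # A cell holding only the arrow is the row marker, not a change.
        if ARROW in cell and cell != ARROW:
            old, new = cell.split(ARROW, 1)
            changes[name] = (old, new)
    return changes


def exceeds_tolerance(old, new, tol=1e-10):
    """True when the change is larger than tol (or is non-numeric)."""
    try:
        return not math.isclose(float(old), float(new), rel_tol=0.0, abs_tol=tol)
    except ValueError:
        return old != new


# A few cells taken from the first diff row above.
header = ["@@", "Unnamed: 0", "GERPpp_RS", "phrank", "min_ranking"]
row = [
    "→",
    "1-100573634-A-T",
    "-11.1→-10.9",
    "4.386657289640127→4.386657289640128",
    "42→116",
]

changes = changed_columns(header, row)
significant = sorted(c for c, (o, n) in changes.items() if exceeds_tolerance(o, n))
print(significant)
```

On this row, `phrank` changes only in the last printed digit and falls within the tolerance, while `GERPpp_RS` and `min_ranking` are real differences.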

@hyunhwan-bcm hyunhwan-bcm changed the title Revert "Revert "job separation by chromosome"" job separation by chromosome Aug 20, 2024
@hyunhwan-bcm hyunhwan-bcm requested a review from jylee-bcm August 20, 2024 06:32
@hyunhwan-bcm hyunhwan-bcm self-assigned this Aug 20, 2024
@hyunhwan-bcm hyunhwan-bcm added the enhancement New feature or request label Aug 20, 2024
@hyunhwan-bcm hyunhwan-bcm added this to the v1.0 milestone Aug 20, 2024
@jylee-bcm jylee-bcm marked this pull request as ready for review August 20, 2024 14:28
Contributor

@jylee-bcm jylee-bcm left a comment

Only one comment!

It's great that it's now properly parallelized!

@hyunhwan-bcm hyunhwan-bcm merged commit 8473ba7 into nextflow_conversion Aug 20, 2024
0 of 2 checks passed