Calculates the predicted score of a given TR sequence, from 0 (poor TR) to 1 (perfect TR). If features=2, returns one score per feature; a good TR should score high on both.
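|          | Type | Default | Details |
|----------|------|---------|---------|
| TR_seq   | str  |         | A string containing the TR DNA sequence |
| features | int  | 1       | Number of features used by the classifier model. Optional: the one-feature model is used by default; set to 2 to use the two-feature model |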
TR_bad = 'TTAGCGAATGGCGAAATTCGTAAACGCCCTCTGATCGAAACCAACGGCGAAACGGGTGAGATCGTGTGGG'
print('TR bad score =', score(TR_bad))
TR_good = 'AAATGATCGCCAAATCTGAACAGGAAATTGGCAAAGCAACCGCTAAATACTTTTTCTACTCAAACATTAT'
print('TR good score =', score(TR_good))
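Because `score` takes the TR as a plain string, it can be useful to check the input before scoring it. The helper below is a minimal sketch and not part of the library: `is_valid_tr` is a hypothetical name, and restricting the alphabet to uppercase A, C, G and T is an assumption about what the model expects.

```python
def is_valid_tr(seq):
    """Return True if seq is a non-empty string made only of A, C, G, T.

    Hypothetical helper; the allowed alphabet is an assumption,
    not a documented requirement of score().
    """
    return isinstance(seq, str) and len(seq) > 0 and set(seq) <= set("ACGT")


TR_good = 'AAATGATCGCCAAATCTGAACAGGAAATTGGCAAAGCAACCGCTAAATACTTTTTCTACTCAAACATTAT'
print(is_valid_tr(TR_good))        # True: only A, C, G, T
print(is_valid_tr('AAATGNNCGCC'))  # False: contains N
```

A check like this would typically run before calling `score`, so that malformed sequences are caught early rather than silently scored.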
Calculates the score of every TR in the list and returns the results as a dataframe. If features=2, returns one score per feature; a good TR should score high on both.
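|              | Type | Default | Details |
|--------------|------|---------|---------|
| TR_seq_list  | list |         | A list of strings containing the TR DNA sequences |
| TR_name_list | list |         | A list of strings containing the TR names |
| features     | int  | 1       | The number of features to use |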
TR_bad = [
    'TTAGCGAATGGCGAAATTCGTAAACGCCCTCTGATCGAAACCAACGGCGAAACGGGTGAGATCGTGTGGG',
    'AAACGCCCTCTGATCGAAACCAACGGCGAAACGGGTGAGATCGTGTGGGACAAAGGTCGTGATTTCGCTA',
    'GGTTTCTCTAAGGAGTCCATTCTGCCGAAGCGCAACTCCGACAAGCTGATCGCGCGTAAGAAGGACTGGG',
    'CAAGCTGATCGCGCGTAAGAAGGACTGGGATCCGAAGAAGTACGGTGGCTTCGATTCTCCGACCGTGGCG',
    'ACCCGATTGACTTCCTCGAGGCGAAGGGGTACAAGGAGGTGAAGAAGGATCTGATTATCAAGCTGCCGAA',
    'AGTACTCCCTGTTCGAGCTGGAGAATGGTCGTAAGCGTATGCTGGCGTCTGCGGGTGAGCTGCAGAAGGG',
    'CAGCACAAGCACTACCTGGACGAGATTATTGAGCAGATTTCTGAGTTTTCTAAGCGCGTGATTCTGGCGG',
    'ACGCGAATCTGGATAAGGTCCTGTCTGCCTACAATAAGCACCGTGATAAGCCGATCCGTGAGCAGGCGGA',
]
score_list(TR_bad, ['TR_bad_' + str(k) for k in range(1, 9)])
|   | TR_Name  | TR_Seq                                            | TR_Score |
|---|----------|---------------------------------------------------|----------|
| 0 | TR_bad_1 | TTAGCGAATGGCGAAATTCGTAAACGCCCTCTGATCGAAACCAACG... | 0.23     |
| 1 | TR_bad_2 | AAACGCCCTCTGATCGAAACCAACGGCGAAACGGGTGAGATCGTGT... | 0.05     |
| 2 | TR_bad_3 | GGTTTCTCTAAGGAGTCCATTCTGCCGAAGCGCAACTCCGACAAGC... | 0.00     |
| 3 | TR_bad_4 | CAAGCTGATCGCGCGTAAGAAGGACTGGGATCCGAAGAAGTACGGT... | 0.00     |
| 4 | TR_bad_5 | ACCCGATTGACTTCCTCGAGGCGAAGGGGTACAAGGAGGTGAAGAA... | 0.01     |
| 5 | TR_bad_6 | AGTACTCCCTGTTCGAGCTGGAGAATGGTCGTAAGCGTATGCTGGC... | 0.08     |
| 6 | TR_bad_7 | CAGCACAAGCACTACCTGGACGAGATTATTGAGCAGATTTCTGAGT... | 0.06     |
| 7 | TR_bad_8 | ACGCGAATCTGGATAAGGTCCTGTCTGCCTACAATAAGCACCGTGA... | 0.12     |
TR_good = [
    'AAATGATCGCCAAATCTGAACAGGAAATTGGCAAAGCAACCGCTAAATACTTTTTCTACTCAAACATTAT',
    'TCAAACATTATGAATTTCTTCAAAACCGAAATCACCTTAGCGAATGGCGAAATTCGTAAACGCCCTCTGA',
    'ATGCCTCAAGTAAACATCGTTAAAAAGACTGAGGTGCAGACTGGCGGTTTCTCTAAGGAGTCCATTCTGC',
    'GGATCCGAAGAAGTACGGTGGCTTCGATTCTCCGACCGTGGCGTACTCTGTTCTGGTGGTCGCCAAGGTC',
    'AGCGTATGCTGGCGTCTGCGGGTGAGCTGCAGAAGGGGAACGAGTTGGCCCTTCCGTCCAAGTACGTGAA',
    'GCAGAAGGGGAACGAGTTGGCCCTTCCGTCCAAGTACGTGAACTTCCTGTACCTGGCCTCGCACTACGAG',
    'CAGAAGCAGCTGTTCGTGGAGCAGCACAAGCACTACCTGGACGAGATTATTGAGCAGATTTCTGAGTTTT',
    'CTAAGCGCGTGATTCTGGCGGACGCGAATCTGGATAAGGTCCTGTCTGCCTACAATAAGCACCGTGATAA',
]
score_list(TR_good, ['TR_good_' + str(k) for k in range(1, 9)])
|   | TR_Name   | TR_Seq                                            | TR_Score |
|---|-----------|---------------------------------------------------|----------|
| 0 | TR_good_1 | AAATGATCGCCAAATCTGAACAGGAAATTGGCAAAGCAACCGCTAA... | 0.84     |
| 1 | TR_good_2 | TCAAACATTATGAATTTCTTCAAAACCGAAATCACCTTAGCGAATG... | 0.82     |
| 2 | TR_good_3 | ATGCCTCAAGTAAACATCGTTAAAAAGACTGAGGTGCAGACTGGCG... | 0.76     |
| 3 | TR_good_4 | GGATCCGAAGAAGTACGGTGGCTTCGATTCTCCGACCGTGGCGTAC... | 0.74     |
| 4 | TR_good_5 | AGCGTATGCTGGCGTCTGCGGGTGAGCTGCAGAAGGGGAACGAGTT... | 0.83     |
| 5 | TR_good_6 | GCAGAAGGGGAACGAGTTGGCCCTTCCGTCCAAGTACGTGAACTTC... | 0.55     |
| 6 | TR_good_7 | CAGAAGCAGCTGTTCGTGGAGCAGCACAAGCACTACCTGGACGAGA... | 0.81     |
| 7 | TR_good_8 | CTAAGCGCGTGATTCTGGCGGACGCGAATCTGGATAAGGTCCTGTC... | 0.81     |
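Since the result is a dataframe, candidate TRs can be shortlisted with ordinary pandas operations. The sketch below rebuilds the `TR_Score` column from the output above (sequences omitted) and keeps the rows at or above a cutoff. The `CUTOFF` value of 0.8 is a hypothetical threshold chosen only for illustration, not a recommendation from the library.

```python
import pandas as pd

# Scores copied from the TR_good output above; sequences omitted for brevity.
good_scores = pd.DataFrame({
    "TR_Name": [f"TR_good_{k}" for k in range(1, 9)],
    "TR_Score": [0.84, 0.82, 0.76, 0.74, 0.83, 0.55, 0.81, 0.81],
})

CUTOFF = 0.8  # hypothetical threshold, for illustration only

# Keep TRs scoring at least CUTOFF, best first; a stable sort keeps ties
# in their original order.
selected = good_scores[good_scores["TR_Score"] >= CUTOFF]
selected = selected.sort_values("TR_Score", ascending=False, kind="stable")
print(selected["TR_Name"].tolist())
```

The same filter would apply unchanged to the dataframe returned by `score_list`, assuming it carries the `TR_Name` and `TR_Score` columns shown in the output.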
TR_bad = [
    'TTAGCGAATGGCGAAATTCGTAAACGCCCTCTGATCGAAACCAACGGCGAAACGGGTGAGATCGTGTGGG',
    'AAACGCCCTCTGATCGAAACCAACGGCGAAACGGGTGAGATCGTGTGGGACAAAGGTCGTGATTTCGCTA',
    'GGTTTCTCTAAGGAGTCCATTCTGCCGAAGCGCAACTCCGACAAGCTGATCGCGCGTAAGAAGGACTGGG',
    'CAAGCTGATCGCGCGTAAGAAGGACTGGGATCCGAAGAAGTACGGTGGCTTCGATTCTCCGACCGTGGCG',
    'ACCCGATTGACTTCCTCGAGGCGAAGGGGTACAAGGAGGTGAAGAAGGATCTGATTATCAAGCTGCCGAA',
    'AGTACTCCCTGTTCGAGCTGGAGAATGGTCGTAAGCGTATGCTGGCGTCTGCGGGTGAGCTGCAGAAGGG',
    'CAGCACAAGCACTACCTGGACGAGATTATTGAGCAGATTTCTGAGTTTTCTAAGCGCGTGATTCTGGCGG',
    'ACGCGAATCTGGATAAGGTCCTGTCTGCCTACAATAAGCACCGTGATAAGCCGATCCGTGAGCAGGCGGA',
]
score_list(TR_bad, ['TR_bad_' + str(k) for k in range(1, 9)], 2)
|   | TR_Name  | TR_Seq                                            | TR_Score_Sp | TR_Score_Avd |
|---|----------|---------------------------------------------------|-------------|--------------|
| 0 | TR_bad_1 | TTAGCGAATGGCGAAATTCGTAAACGCCCTCTGATCGAAACCAACG... | 0.23        | 0.63         |
| 1 | TR_bad_2 | AAACGCCCTCTGATCGAAACCAACGGCGAAACGGGTGAGATCGTGT... | 0.05        | 0.42         |
| 2 | TR_bad_3 | GGTTTCTCTAAGGAGTCCATTCTGCCGAAGCGCAACTCCGACAAGC... | 0.00        | 0.30         |
| 3 | TR_bad_4 | CAAGCTGATCGCGCGTAAGAAGGACTGGGATCCGAAGAAGTACGGT... | 0.00        | 0.51         |
| 4 | TR_bad_5 | ACCCGATTGACTTCCTCGAGGCGAAGGGGTACAAGGAGGTGAAGAA... | 0.01        | 0.59         |
| 5 | TR_bad_6 | AGTACTCCCTGTTCGAGCTGGAGAATGGTCGTAAGCGTATGCTGGC... | 0.08        | 0.54         |
| 6 | TR_bad_7 | CAGCACAAGCACTACCTGGACGAGATTATTGAGCAGATTTCTGAGT... | 0.06        | 0.29         |
| 7 | TR_bad_8 | ACGCGAATCTGGATAAGGTCCTGTCTGCCTACAATAAGCACCGTGA... | 0.12        | 0.04         |
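With features=2, a TR is meant to score high on both features. One way to act on that, sketched below, is to rank TRs by the lower of their two scores. This is only an illustrative reading of "both high", not the library's own ranking rule, and the `TR_Score_Min` column is a hypothetical name introduced here. The values are copied from the output above.

```python
import pandas as pd

# Two-feature scores copied from the TR_bad output above; sequences omitted.
two_feature = pd.DataFrame({
    "TR_Name": [f"TR_bad_{k}" for k in range(1, 9)],
    "TR_Score_Sp": [0.23, 0.05, 0.00, 0.00, 0.01, 0.08, 0.06, 0.12],
    "TR_Score_Avd": [0.63, 0.42, 0.30, 0.51, 0.59, 0.54, 0.29, 0.04],
})

# Assumption: treat a TR as only as good as its weaker feature, so take the
# row-wise minimum of the two scores (hypothetical TR_Score_Min column).
two_feature["TR_Score_Min"] = two_feature[["TR_Score_Sp", "TR_Score_Avd"]].min(axis=1)

# Rank best first; a stable sort keeps ties in their original order.
ranked = two_feature.sort_values("TR_Score_Min", ascending=False, kind="stable")
print(ranked[["TR_Name", "TR_Score_Min"]].head(3))
```

Ranking by the minimum penalises a TR such as TR_bad_8, whose `TR_Score_Sp` of 0.12 is offset by a `TR_Score_Avd` of only 0.04.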
TR_good = [
    'AAATGATCGCCAAATCTGAACAGGAAATTGGCAAAGCAACCGCTAAATACTTTTTCTACTCAAACATTAT',
    'TCAAACATTATGAATTTCTTCAAAACCGAAATCACCTTAGCGAATGGCGAAATTCGTAAACGCCCTCTGA',
    'ATGCCTCAAGTAAACATCGTTAAAAAGACTGAGGTGCAGACTGGCGGTTTCTCTAAGGAGTCCATTCTGC',
    'GGATCCGAAGAAGTACGGTGGCTTCGATTCTCCGACCGTGGCGTACTCTGTTCTGGTGGTCGCCAAGGTC',
    'AGCGTATGCTGGCGTCTGCGGGTGAGCTGCAGAAGGGGAACGAGTTGGCCCTTCCGTCCAAGTACGTGAA',
    'GCAGAAGGGGAACGAGTTGGCCCTTCCGTCCAAGTACGTGAACTTCCTGTACCTGGCCTCGCACTACGAG',
    'CAGAAGCAGCTGTTCGTGGAGCAGCACAAGCACTACCTGGACGAGATTATTGAGCAGATTTCTGAGTTTT',
    'CTAAGCGCGTGATTCTGGCGGACGCGAATCTGGATAAGGTCCTGTCTGCCTACAATAAGCACCGTGATAA',
]
score_list(TR_good, ['TR_good_' + str(k) for k in range(1, 9)], 2)
Calculates the predicted DGR mutagenesis percentage of every TR in the list and returns the results as a dataframe.
|              | Type | Details |
|--------------|------|---------|
| TR_seq_list  | list | A list of strings containing the TR DNA sequences |
| TR_name_list | list | A list of strings containing the TR names |
TR_bad = [
    'TTAGCGAATGGCGAAATTCGTAAACGCCCTCTGATCGAAACCAACGGCGAAACGGGTGAGATCGTGTGGG',
    'AAACGCCCTCTGATCGAAACCAACGGCGAAACGGGTGAGATCGTGTGGGACAAAGGTCGTGATTTCGCTA',
    'GGTTTCTCTAAGGAGTCCATTCTGCCGAAGCGCAACTCCGACAAGCTGATCGCGCGTAAGAAGGACTGGG',
    'CAAGCTGATCGCGCGTAAGAAGGACTGGGATCCGAAGAAGTACGGTGGCTTCGATTCTCCGACCGTGGCG',
    'ACCCGATTGACTTCCTCGAGGCGAAGGGGTACAAGGAGGTGAAGAAGGATCTGATTATCAAGCTGCCGAA',
    'AGTACTCCCTGTTCGAGCTGGAGAATGGTCGTAAGCGTATGCTGGCGTCTGCGGGTGAGCTGCAGAAGGG',
    'CAGCACAAGCACTACCTGGACGAGATTATTGAGCAGATTTCTGAGTTTTCTAAGCGCGTGATTCTGGCGG',
    'ACGCGAATCTGGATAAGGTCCTGTCTGCCTACAATAAGCACCGTGATAAGCCGATCCGTGAGCAGGCGGA',
]
DGR_percentage_list(TR_bad, ['TR_bad_' + str(k) for k in range(1, 9)])
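|   | TR_Name  | TR_Seq   |
|---|----------|----------|
| 0 | TR_bad_1 | 0.427049 |
| 1 | TR_bad_2 | 0.110769 |
| 2 | TR_bad_3 | 0.019986 |
| 3 | TR_bad_4 | 0.025612 |
| 4 | TR_bad_5 | 0.045752 |
| 5 | TR_bad_6 | 0.172833 |
| 6 | TR_bad_7 | 0.092111 |
| 7 | TR_bad_8 | 0.039965 |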
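To read the predicted mutagenesis values next to the classifier scores, the two results can be joined on `TR_Name`. The sketch below rebuilds both tables from the values printed in this page rather than calling the library. `DGR_percentage` is a hypothetical column name introduced here for the numeric column of the output above, and the units are left exactly as printed.

```python
import pandas as pd

names = [f"TR_bad_{k}" for k in range(1, 9)]

# One-feature scores, copied from the score_list(TR_bad, ...) output.
scores = pd.DataFrame({
    "TR_Name": names,
    "TR_Score": [0.23, 0.05, 0.00, 0.00, 0.01, 0.08, 0.06, 0.12],
})

# Predicted values copied from the DGR_percentage_list(TR_bad, ...) output.
# "DGR_percentage" is a hypothetical column name used for clarity.
dgr = pd.DataFrame({
    "TR_Name": names,
    "DGR_percentage": [0.427049, 0.110769, 0.019986, 0.025612,
                       0.045752, 0.172833, 0.092111, 0.039965],
})

# Inner join on the TR name so each row carries both values.
combined = scores.merge(dgr, on="TR_Name", how="inner")
print(combined)
```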
TR_good = [
    'AAATGATCGCCAAATCTGAACAGGAAATTGGCAAAGCAACCGCTAAATACTTTTTCTACTCAAACATTAT',
    'TCAAACATTATGAATTTCTTCAAAACCGAAATCACCTTAGCGAATGGCGAAATTCGTAAACGCCCTCTGA',
    'ATGCCTCAAGTAAACATCGTTAAAAAGACTGAGGTGCAGACTGGCGGTTTCTCTAAGGAGTCCATTCTGC',
    'GGATCCGAAGAAGTACGGTGGCTTCGATTCTCCGACCGTGGCGTACTCTGTTCTGGTGGTCGCCAAGGTC',
    'AGCGTATGCTGGCGTCTGCGGGTGAGCTGCAGAAGGGGAACGAGTTGGCCCTTCCGTCCAAGTACGTGAA',
    'GCAGAAGGGGAACGAGTTGGCCCTTCCGTCCAAGTACGTGAACTTCCTGTACCTGGCCTCGCACTACGAG',
    'CAGAAGCAGCTGTTCGTGGAGCAGCACAAGCACTACCTGGACGAGATTATTGAGCAGATTTCTGAGTTTT',
    'CTAAGCGCGTGATTCTGGCGGACGCGAATCTGGATAAGGTCCTGTCTGCCTACAATAAGCACCGTGATAA',
]
DGR_percentage_list(TR_good, ['TR_good_' + str(k) for k in range(1, 9)])