Why are SNPs more common in non-coding regions than in coding regions?

dasfas

Well-Known Member
Joined
Dec 17, 2019
Messages
469
Gender
Male
HSC
2019
Yeah exactly what Hiva said.

If you want to go into even more detail, you can say that coding portions of a gene are typically highly conserved. This is because a nucleotide change there is likely to have a more obvious effect on the protein, through an amino acid substitution or some other change to the polypeptide chain. This means that mutations in coding portions are less tolerated, and are less likely to be passed on.

Mutations in introns tend to matter less because introns are typically spliced out during RNA processing. They therefore tend to have a smaller effect on the organism, are better tolerated, and are more likely to be passed on.

So ultimately, mutations in non-coding regions tend to have less of an overall effect on the organism, which is why they are more tolerated. They are more likely to get passed on, and are therefore more common within a population.


An interesting metric (outside the HSC scope, but one you would come across if you ever work in bioinformatics) is how tolerant a gene is to predicted loss-of-function (pLoF) variants. Scores such as pLI (the probability of being loss-of-function intolerant) tell you how tolerant a gene is to such mutations. Different genes have different tolerances. For example, a gene that regulates a critical function of cells may be very intolerant, whereas a gene that regulates some less important function may be more tolerant to mutations.

Also, different parts of the intron have different tolerances to mutations. For example, the splice sites (donor and acceptor) are extremely intolerant to mutations, and disrupting them almost always results in mis-splicing. This is because you can think of them as the guide that tells the splicing machinery where to excise the introns and join the exons together.

Finally, you can definitely have intronic mutations leading to disease. One that comes to mind is a hexanucleotide repeat expansion in a gene known as C9orf72. This typically leads to problems in neurons, which manifest clinically as ALS (the disease Stephen Hawking suffered from).
 

specificagent1

Well-Known Member
Joined
Aug 24, 2021
Messages
1,977
Gender
Male
HSC
2021
Coding regions of DNA are sections of DNA that code for proteins. Even slight changes to the DNA sequence of this region can completely change the protein produced, which can have detrimental effects on the organism. The organism is then less likely to survive and pass on its genetic material containing the SNP, which is why SNPs are less commonly found in coding regions of DNA.

In contrast, certain sections of non-coding DNA are considered junk DNA and have no effect on protein production. If a SNP occurs in this region, it won't affect the organism's survivability, so the organism will pass the SNP on to its offspring. This results in SNPs being found in non-coding regions of DNA more than in coding regions.
Would non-coding DNA making up the large majority of the DNA also contribute to this? Only a small amount is actually coding DNA.
 

specificagent1

No, those differences are accounted for. Let's look at this mathematically.
Let's say I have 10 coding segments of DNA and 100 non-coding segments of DNA.

Coding section: 2/10 have SNPs = 20%
Non-coding section: 50/100 have SNPs = 50%, so SNPs are more commonly found in non-coding regions
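
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the made-up numbers in this post (10 coding and 100 non-coding segments); the function name `snp_rate` is hypothetical and not taken from any bioinformatics library.

```python
def snp_rate(segments_with_snp: int, total_segments: int) -> float:
    """Fraction of segments in a region that carry a SNP."""
    if total_segments <= 0:
        raise ValueError("total_segments must be positive")
    return segments_with_snp / total_segments


# Illustrative numbers from the example above, not real data.
coding_rate = snp_rate(2, 10)
noncoding_rate = snp_rate(50, 100)

print(f"coding: {coding_rate:.0%}, non-coding: {noncoding_rate:.0%}")
```

Because each rate is divided by the size of its own region, the comparison is not affected by there being ten times as many non-coding segments.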

They measure it based on the percentage of SNPs found within a region. If I were to measure it based on amounts of DNA alone, then I would be incorrect. Take a look at this:

(This case isn't realistic, as there is more non-coding DNA than coding DNA, but I'm just showing an example of why we don't use amounts of DNA.)
Non-coding section: 7/10 have SNPs = 70%
Coding section: 50/100 have SNPs = 50%
In this hypothetical scenario, SNPs are more common in the non-coding region, as shown by the 70%. However, if we were to base it on the raw number of SNPs, we would conclude that the coding DNA has more SNPs, as it has 50 SNPs compared to non-coding's 7, even though SNPs are still more commonly found in the non-coding region.


So measuring how common SNPs are in DNA has nothing to do with whether more non-coding DNA is present, as they measure it by the percentage in which SNPs are found.
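
To make the difference between counting SNPs and measuring their percentage concrete, here is a minimal Python sketch of the hypothetical scenario (7 of 10 non-coding segments versus 50 of 100 coding segments). The dictionary layout and the helper names `top_by_count` and `top_by_rate` are assumptions made purely for illustration.

```python
# Hypothetical numbers from the scenario above, not real data.
regions = {
    "non-coding": {"snps": 7, "segments": 10},
    "coding": {"snps": 50, "segments": 100},
}


def top_by_count(regions: dict) -> str:
    """Region with the largest raw number of SNPs."""
    return max(regions, key=lambda name: regions[name]["snps"])


def top_by_rate(regions: dict) -> str:
    """Region with the largest share of segments carrying a SNP."""
    return max(
        regions,
        key=lambda name: regions[name]["snps"] / regions[name]["segments"],
    )


print("most SNPs by raw count:", top_by_count(regions))
print("most SNPs by percentage:", top_by_rate(regions))
```

The two rankings disagree here: the raw count favours whichever region is larger, while the percentage compares regions on equal footing.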
Oh ok, makes sense. So they measure what percentage of each section has SNPs. But given that there is more non-coding DNA, doesn't that mean there is a greater probability of a SNP occurring in the non-coding DNA? If I have 10 cookies, 7 chocolate and 3 mint, then I have more of a chance of breaking a chocolate cookie when I transfer them to their package. Are there not more chances of a SNP occurring in non-coding DNA?
 
