FamLink2 defines genetic markers through two input files: a .map file
and a .freq file. Both are tab-separated.
The .map file describes a genetic map, one marker per line, in the following
format (a header is not needed)
[Chromosome] [Marker id] [Genetic position, in cM]
1 rs1234 2.3
These values can be obtained from various sources, for instance the deCODE
map published in 2019 (female, male and sex-averaged maps are available as
supplementary files) or the Rutgers
repository. In addition, Illumina provides a genetic
map for the GSA chip.
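As an illustration of the .map layout described above, the following minimal Python sketch reads tab-separated rows into tuples. The function name `read_map`, the second marker `rs5678` and the rule of skipping rows whose third column is not numeric (to tolerate an optional header) are assumptions made for this example; they are not part of FamLink2.

```python
import csv
import io


def read_map(handle):
    """Parse tab-separated .map rows into (chromosome, marker_id, position_cM) tuples."""
    markers = []
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or incomplete lines
        chrom, marker_id, pos = row[0], row[1], row[2]
        try:
            position_cm = float(pos)
        except ValueError:
            continue  # assumption: a non-numeric position means an optional header row
        markers.append((chrom, marker_id, position_cm))
    return markers


# Hypothetical two-marker example; rs5678 is made up for illustration.
example = "1\trs1234\t2.3\n1\trs5678\t4.1\n"
markers = read_map(io.StringIO(example))
```

A check of this kind can help catch space-separated or incomplete rows before a file is imported.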
Once the .map file has been imported into FamLink2, the genetic markers are
defined. Next, frequencies for different populations are imported using .freq
files. The format is
M rs1234
A C 0.2
A G 0.8
Here, M marks the line where a marker starts and A marks each line where an
allele and its frequency are given. Frequencies are either
constructed in-house or extracted from the 1000G data
repository. Let us know if assistance is needed with this. We suggest using bedtools intersect in a Linux environment to
intersect a list of SNPs (for instance), represented as a BED file, with the VCF files downloaded from http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/.
Frequencies for the various populations can subsequently be extracted in R
from the resulting intersect files.
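The text suggests R for the extraction step; as an alternative illustration, the Python sketch below turns one VCF data line (for example a line kept by the intersect step) into a .freq record. It assumes that the INFO column carries a per-superpopulation field such as `EUR_AF`, that the site is a biallelic SNP, and that the reference allele frequency is one minus the alternate allele frequency. The function names, the example line and the alphabetical ordering of alleles are choices made for this sketch, not FamLink2 requirements.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'AC=1;EUR_AF=0.2' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True  # flag entries carry no value
    return fields


def freq_record(vcf_line, af_key="EUR_AF"):
    """Build the M/A lines of a .freq record for one biallelic SNP."""
    cols = vcf_line.rstrip("\n").split("\t")
    marker_id, ref, alt, info = cols[2], cols[3], cols[4], cols[7]
    if len(ref) != 1 or len(alt) != 1:
        return None  # this sketch skips indels and multiallelic sites
    alt_freq = float(parse_info(info)[af_key])
    freqs = {ref: 1.0 - alt_freq, alt: alt_freq}
    lines = ["M\t" + marker_id]
    for allele in sorted(freqs):  # alphabetical order, an arbitrary choice
        lines.append("A\t%s\t%s" % (allele, round(freqs[allele], 6)))
    return "\n".join(lines)


# Made-up VCF data line for illustration (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO).
example_line = "1\t1000\trs1234\tG\tC\t100\tPASS\tAC=1;AF=0.2;EUR_AF=0.2"
record = freq_record(example_line)
```

Alleles with a frequency of zero in the chosen population may need separate handling before import; the sketch does not address that.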
Below we provide some example files for commonly used MPS/NGS panels.
| Panel | Reference | #Markers | Map file | Freq files | FamLink2 project | Comment |
|---|---|---|---|---|---|---|
| FORCE | | 4,360 | | | | Freq files for seven populations; many thanks to Chris Phillips and María de la Puente for preparing those files. Ancestry and Y/X SNPs were removed. |
| KIntelligence | | 9,930 | | | | Extracted from the 1000G Phase3 datasets for the superpopulations. Some 300 SNPs in the published VCF that lack frequencies in 1000G were removed. |
| MPSplex (ICMP panel) | | 1,241 | | | | Extracted from the 1000G Phase3 datasets for the superpopulations. |
| 25K | | 17,232 | | | | Extracted from the 1000G Phase3 datasets for the superpopulations. Pruning was performed in PLINK with a sliding-window approach and r² = 0.2 as the threshold. The original number of markers was 25,000. Note that FamLink2 (version 2.4) can be slow in several dialogs for this number of markers. |
| 95K | | 53,597 | | | TBA | Extracted from the 1000G Phase3 datasets for the superpopulations. Pruning was performed in PLINK with a sliding-window approach and r² = 0.2 as the threshold. The original number of markers was 95,000. Note that FamLink2 (version 2.4) can be slow in several dialogs for this number of markers. |
*TBA=To be added
You may send comments to daniel.kling@rmv.se