Databases Genotype likelihoods
FamLink2 defines genetic markers through two tab-separated input files: a .map
file and a .freq file.
The .map file describes a genetic map and has the following format (no header is
needed):
[Chromosome] [Marker id] [Genetic position, in cM]
1 rs1234 2.3
These values can be obtained from various sources, for instance the deCODE
map published in 2019 (female, male and sex-averaged maps are available as
supplementary files) or the Rutgers repository. In addition, Illumina provides a
genetic map for the GSA chip.
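As a rough illustration of the .map layout described above, the following Python sketch writes and reads such a file. It assumes only what is stated here: three tab-separated columns (chromosome, marker id, genetic position in cM) and no header. The function names and the second marker (rs5678 at 4.1 cM) are made up for the example.

```python
import csv
import io

def write_map(markers, handle):
    """Write (chromosome, marker id, position in cM) rows, tab-separated, no header."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for chrom, marker_id, pos_cm in markers:
        writer.writerow([chrom, marker_id, pos_cm])

def read_map(handle):
    """Read the same three-column layout back into a list of tuples."""
    rows = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        chrom, marker_id, pos_cm = row
        rows.append((chrom, marker_id, float(pos_cm)))
    return rows

# rs5678 and its position are hypothetical values, used only for illustration.
markers = [("1", "rs1234", 2.3), ("1", "rs5678", 4.1)]

buffer = io.StringIO()
write_map(markers, buffer)
map_text = buffer.getvalue()
parsed = read_map(io.StringIO(map_text))
```

To write an actual file, pass an object opened with open("panel.map", "w", newline="") instead of the in-memory buffer (the file name is only an example).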
Once the .map file has been imported into FamLink2, the genetic markers are
defined. Next, frequencies for different populations are imported using .freq
files. The format is:
M rs1234
A C 0.2
A G 0.8
Here, M marks the line where a marker starts and A marks each line that defines
an allele and its frequency. Frequencies are either constructed in-house or
extracted from the 1000G data repository. Let us know if assistance is needed
with this. We suggest using bedtools intersect in a Linux environment to
extract, for instance, a list of SNPs, represented as a bed file, from the vcf
files downloaded at http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/.
Frequencies for the various populations can subsequently be extracted in R
from the resulting intersect files.
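The extraction step is described above for R; as an alternative, the Python sketch below shows one way a single record from an intersect file could be turned into a .freq block. It rests on several assumptions that should be checked against your own files: the intersect output keeps the standard VCF columns unchanged, the INFO field carries a per-superpopulation key such as EUR_AF, and the .freq layout is exactly the tab-separated M/A layout shown earlier. The function names and the example record are illustrative only.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'AC=1;EUR_AF=0.8' into a dict."""
    info = {}
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        info[key] = value
    return info

def vcf_line_to_freq(line, population_key="EUR_AF"):
    """Build a .freq block (M line plus one A line per allele) from one VCF record.

    Assumes standard VCF columns: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO.
    """
    fields = line.rstrip("\n").split("\t")
    marker_id, ref, alt = fields[2], fields[3], fields[4]
    info = parse_info(fields[7])

    # Multi-allelic records list ALT alleles and their frequencies comma-separated.
    alt_alleles = alt.split(",")
    alt_freqs = [float(value) for value in info[population_key].split(",")]

    # The reference allele takes the remaining probability mass.
    ref_freq = round(1.0 - sum(alt_freqs), 6)

    lines = [f"M\t{marker_id}", f"A\t{ref}\t{ref_freq:g}"]
    for allele, freq in zip(alt_alleles, alt_freqs):
        lines.append(f"A\t{allele}\t{freq:g}")
    return "\n".join(lines) + "\n"

# Made-up record (position and counts are not real data).
example_line = "1\t12345\trs1234\tC\tG\t100\tPASS\tAC=1;AF=0.8;EUR_AF=0.8"
freq_block = vcf_line_to_freq(example_line)
```

For this made-up record the block reproduces the .freq example above (C at 0.2, G at 0.8). Alleles with a frequency of zero in the chosen population are written as-is; whether they should be dropped or adjusted before import is left open here.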
Below we provide some example files for commonly used MPS/NGS panels.
Panel | Reference | #Markers | Map file | Freq files | FamLink2 project | Comment
FORCE | 4,360 | TBA | Freq files for seven populations, many thanks to Chris Phillips and María de la Puente for preparing those files
KIntelligence | 9,930 | TBA | TBA | Removes some 300 SNPs from the published vcf
MPSplex (ICMP panel) | 1,241 | Extracted from the 1000G Phase3 datasets for the superpopulations
25K | 17,232 | Extracted from the 1000G Phase3 datasets for the superpopulations. Note, a pruning was performed in PLINK with r2=0.2 as a threshold with a sliding window approach. Original #Markers was 25,000. We note that FamLink2 can be slow in several dialogs for this number of markers (version 2.4).
95K | 53,597 | TBA | Extracted from the 1000G Phase3 datasets for the superpopulations. Note, a pruning was performed in PLINK with r2=0.2 as a threshold with a sliding window approach. Original #Markers was 95,000. We note that FamLink2 can be slow in several dialogs for this number of markers (version 2.4).
*TBA=To be added
You may send comments to daniel.kling@rmv.se