FamLink home page




Databases

FamLink2 defines genetic markers through two tab-separated input files: a .map file and a .freq file. The .map file describes a genetic map, one marker per line, in the following format (a header line is not needed):

[Chromosome]    [Marker id]    [Genetic position, in cM]
1               rs1234         2.3

These values can be obtained from various sources, for instance the deCODE map published in 2019 (female, male and sex-averaged maps are available as supplementary files) or the Rutgers repository. In addition, Illumina provides a genetic map for the GSA chip.
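As an illustration of the .map layout described above, the following minimal Python sketch reads such a file and checks that every row has three tab-separated fields with a numeric position. The names MapEntry and read_map are hypothetical and not part of FamLink2, and skipping a leading non-numeric row as an optional header is our assumption.

```python
import csv
import io
from typing import List, NamedTuple, TextIO


class MapEntry(NamedTuple):
    chromosome: str
    marker_id: str
    position_cm: float


def read_map(handle: TextIO) -> List[MapEntry]:
    """Read tab-separated rows: chromosome, marker id, genetic position (cM)."""
    entries: List[MapEntry] = []
    for row in csv.reader(handle, delimiter="\t"):
        if not any(field.strip() for field in row):
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 tab-separated fields, got {row!r}")
        chrom, marker, pos = (field.strip() for field in row)
        try:
            position = float(pos)
        except ValueError:
            if entries:
                raise ValueError(f"non-numeric position in row {row!r}") from None
            # Assumption: a non-numeric row before any data is a header line.
            continue
        entries.append(MapEntry(chrom, marker, position))
    return entries


# Hypothetical two-marker example in the format shown above.
example = "1\trs1234\t2.3\n1\trs5678\t4.1\n"
markers = read_map(io.StringIO(example))
for entry in markers:
    print(entry.chromosome, entry.marker_id, entry.position_cm)
```

Running such a check before importing can catch rows that were separated by spaces instead of tabs.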

Importing the .map file into FamLink2 defines the genetic markers. Next, allele frequencies for different populations are imported using .freq files. The format is

M    rs1234

A    C    0.2

A    G   0.8

Here, M marks the start of a marker and A marks each of its alleles, followed by the allele and its frequency. Frequencies are either constructed in-house or extracted from the 1000 Genomes (1000G) data repository; let us know if you need assistance with this. We suggest using bedtools intersect in a Linux environment to intersect a list of SNPs (for instance), represented as a bed file, with the vcf files downloaded from http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ . Frequencies for the various populations can subsequently be extracted in R from the resulting intersect files.
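The extraction step can also be sketched in Python instead of R. The example below is a minimal sketch, not the procedure used to build the files on this page: it takes one biallelic vcf record, reads a population frequency from the INFO column, and writes and re-reads the .freq layout shown above. All function names are hypothetical, the vcf record is made up for illustration, and we assume the INFO column carries per-superpopulation keys such as EUR_AF holding the frequency of the alternate allele.

```python
from typing import Dict, Tuple

Freqs = Dict[str, Dict[str, float]]  # marker id -> {allele: frequency}


def vcf_line_to_freqs(line: str, info_key: str = "AF") -> Tuple[str, Dict[str, float]]:
    """Return (marker id, {REF: 1 - f, ALT: f}) for one biallelic vcf record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a vcf record needs at least 8 tab-separated columns")
    marker_id, ref, alt, info = fields[2], fields[3], fields[4], fields[7]
    if "," in alt:
        raise ValueError(f"{marker_id}: multi-allelic sites are not handled here")
    info_map = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    alt_freq = float(info_map[info_key])
    # Rounding avoids floating-point noise such as 0.19999999999999996.
    return marker_id, {ref: round(1.0 - alt_freq, 6), alt: round(alt_freq, 6)}


def format_freq(markers: Freqs) -> str:
    """Write one 'M' line per marker followed by one 'A' line per allele."""
    lines = []
    for marker_id, alleles in markers.items():
        lines.append(f"M\t{marker_id}")
        for allele, freq in alleles.items():
            lines.append(f"A\t{allele}\t{freq:.6g}")
    return "\n".join(lines) + "\n"


def parse_freq(text: str) -> Freqs:
    """Read the layout back; 'A' lines belong to the preceding 'M' line."""
    markers: Freqs = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if fields[0] == "M" and len(fields) == 2:
            current = markers.setdefault(fields[1], {})
        elif fields[0] == "A" and len(fields) == 3 and current is not None:
            current[fields[1]] = float(fields[2])
        else:
            raise ValueError(f"unexpected line: {line!r}")
    return markers


# Made-up record for illustration only (not real 1000G data).
record = "1\t12345\trs1234\tC\tG\t100\tPASS\tAC=1;AF=0.7;EUR_AF=0.8"
marker, freqs = vcf_line_to_freqs(record, info_key="EUR_AF")
freq_text = format_freq({marker: freqs})
print(freq_text, end="")
```

Multi-allelic sites are rejected rather than split, since how they should be represented depends on the panel; handle them separately if your SNP list contains any.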

Below we provide some example files for commonly used MPS/NGS panels.

| Panel | Reference | #Markers | Map file | Freq files | FamLink2 project | Comment |
|---|---|---|---|---|---|---|
| FORCE | Tillmar et al. | 4,360 | FORCE_genetic_map.map | FORCE_freq_files.zip | TBA | Freq files for seven populations, many thanks to Chris Phillips and María de la Puente for preparing those files |
| KIntelligence | Snedecor et al. | 9,930 | kintelligence_genetic_map.map | TBA | TBA | Removes some 300 SNPs from the published vcf |
| MPSplex (ICMP panel) | Phillips et al. | 1,241 | mpsplex_genetic_map.map | mpsplex_freqs.zip | mpsplex_databases.sav | Extracted from the 1000G Phase3 datasets for the superpopulations. |
| 25K | Gorden et al. | 17,232 | 25k_genetic_map.map | 25k_freqs.zip | 25k_databases.sav | Extracted from the 1000G Phase3 datasets for the superpopulations. Note, a pruning was performed in PLINK with r2=0.2 as a threshold with a sliding window approach. Original #Markers was 25,000. We note that FamLink2 can be slow in several dialogs for this number of markers (version 2.4). |
| 95K | Gorden et al. | 53,597 | 95k_genetic_map.map | 95k_freqs.zip | TBA | Extracted from the 1000G Phase3 datasets for the superpopulations. Note, a pruning was performed in PLINK with r2=0.2 as a threshold with a sliding window approach. Original #Markers was 95,000. We note that FamLink2 can be slow in several dialogs for this number of markers (version 2.4). |

*TBA=To be added




You may send comments to daniel.kling@rmv.se