Viewing DNA.Land VCF Files

Your DNA.Land VCF file lists the imputed results for 39 million genetic variants across your genome. The file is huge and cannot be opened with standard tools such as Microsoft Excel.

This tutorial shows how to view the file on a Windows or Mac computer, how to search for specific variants, and how to read the VCF format.

Two methods are shown:

  1. Using DNA.Land Compass - a website designed for easy VCF exploration.
    This method does not require any software installation.
    DNA.Land Compass requires downloading two files: .vcf.gz and .tbi files - both are available from DNA.Land.
  2. Using glogg - a special program which allows viewing large files on your computer.
    This method requires installing the glogg program and is meant for advanced users (installation instructions below).
    glogg requires one file, the .vcf.gz file, which is available from DNA.Land.


  • DNA.Land is a research project - the data is presented for curiosity and educational purposes and should not be used for clinical diagnosis.
  • The information contained in the downloaded VCF is the result of an imputation process. It is noisy and probabilistic.
    Visit the Imputation page to learn why.

Using DNA.Land Compass on Windows:
  1. On the DNA.Land main page, scroll down to the My Files section, and click on the Imputed VCF file:
  2. The downloads screen will be displayed:
  3. Right-click on the vcf.gz link to display the options menu, and select the Save target as... option.
    NOTE: it is recommended to use this method instead of just clicking on the file to avoid incorrect automatic handling of the file by Windows.
  4. Save the file in the Downloads folder as a .vcf.gz file:
  5. The downloaded file should look like so:
  6. An additional file is required by DNA.Land Compass - the .tbi file.
    On the DNA.Land main page, scroll down to the My Files section, and click on the Imputed TBI file:
  7. The downloads screen for the TBI file will be displayed:
  8. Right-click on the vcf.gz.tbi link to display the options menu, and select the Save target as... option.
    NOTE: it is recommended to use this method instead of just clicking on the file to avoid incorrect automatic handling of the file by Windows.
  9. Save the file in the Downloads folder as a .gz.tbi file:
  10. The downloaded file should look like so:
  11. With the two files downloaded on your computer, visit the DNA.Land Compass Website to upload the files and view them.
    To learn more, see the DNA.Land Compass guide.
  12. Read below to learn more about the meaning of the values in DNA.Land's VCF file.

Using DNA.Land Compass on Mac OS X:
  1. On the DNA.Land main page, scroll down to the My Files section, and click on the Imputed VCF file:
  2. The downloads screen will be displayed:
  3. Hold CONTROL and click on the vcf.gz link to display the options menu, and select the Download Linked File option.
    NOTE: it is recommended to use this method instead of just clicking on the file to avoid incorrect automatic handling of the file by Mac OS X.
  4. The file will be downloaded and stored in the Downloads folder:
  5. Getting the Imputed TBI:

  6. On the DNA.Land main page, scroll down to the My Files section, and click on the Imputed TBI file:
  7. The downloads screen will be displayed:
  8. Hold CONTROL and click on the vcf.gz.tbi link to display the options menu, and select the Download Linked File option.
    NOTE: it is recommended to use this method instead of just clicking on the file to avoid incorrect automatic handling of the file by Mac OS X.
  9. The file will be downloaded and stored in the Downloads folder:
  10. With the two files downloaded on your computer, visit the DNA.Land Compass Website to upload the files and view them.
    To learn more, see the DNA.Land Compass guide.
  11. Read below to learn more about the meaning of the values in DNA.Land's VCF file.

Using glogg on Windows:
  1. On the DNA.Land main page, scroll down to the My Files section, and click on the Imputed VCF file:
  2. The downloads screen will be displayed:
  3. Right-click on the vcf.gz link to display the options menu, and select the Save target as... option.
    NOTE: it is recommended to use this method instead of just clicking on the file to avoid incorrect automatic handling of the file by Windows.
  4. Save the file in the Downloads folder as a .gz file:
  5. The downloaded file should look like so:
  6. To decompress the file, right-click on the file, and select Extract Here from the 7-Zip menu.
    7-Zip (or a similar decompression program) must be installed to decompress the file.
  7. Decompressing the file might take a few minutes (depending on the computer's speed):
  8. When decompression is complete, the Downloads folder should contain two files, the compressed .gz file and the decompressed .vcf file:
  9. To view such a large file, a special program is needed.
    Here we show how to download and use a program called glogg.
    Download glogg for Windows from the glogg download page, and run the installation program.
  10. After installation is complete, open the Start Menu, and select the glogg program:
  11. glogg's initial screen is empty. Click the yellow Open button or select Open from the File menu.
  12. In the Open Dialog, navigate to the Downloads folder and select the .vcf file.
  13. glogg will load the VCF file (this might take a few minutes, depending on the computer's speed):
  14. Once the file is loaded, its content can be searched using the search box at the bottom of the screen. In the example below we are searching for the SNP identified as rs17822931 (the variant associated with wet earwax). Depending on the computer's speed, the search might take a few minutes.
  15. If the SNP is found (technically: if the text we've searched for was found in the file), the matching lines will be shown in the search results window at the bottom:
  16. Read below to learn more about the meaning of the values in DNA.Land's VCF file.

Using glogg on Mac OS X:
  1. On the DNA.Land main page, scroll down to the My Files section, and click on the Imputed VCF file:
  2. The downloads screen will be displayed:
  3. Hold CONTROL and click on the vcf.gz link to display the options menu, and select the Download Linked File option.
    NOTE: it is recommended to use this method instead of just clicking on the file to avoid incorrect automatic handling of the file by Mac OS X.
  4. The file will be downloaded and stored in the Downloads folder:
  5. When the download is complete, double-click on the file to decompress it.
  6. Decompressing the file will take a few minutes (depending on your computer's speed):
  7. When decompression is complete, a second file with the .vcf extension will appear in the Downloads folder.
    NOTE: vcf files are sometimes associated with the Contacts program as vCards File format - this VCF is not such a vCard contacts file, and should not be opened in Contacts.
  8. To view such a large file, a special program is needed.
    Here we show how to download and use a program called glogg.
    Download and save the glogg.dmg file in the Downloads folder, then double-click on the file to open it:
  9. The glogg application file will be displayed in a new window:
  10. Simply running the application will be blocked by Mac OS X, with the following warning:
  11. To enable the application, hold CONTROL and click the application icon, then select Open from the menu:
  12. A new warning message will appear. Select Open to start the application:
  13. glogg's window will appear. Click on the yellow Open icon, or select Open from the File menu:
  14. In the Open Dialog, navigate to the Downloads folder and select the .vcf file:
  15. glogg will load the VCF file (this might take a few minutes, depending on the computer's speed):
  16. Once the file is loaded, its content can be searched using the search box at the bottom of the screen. In the example below we are searching for the SNP identified as rs17822931 (the variant associated with wet earwax). Depending on the computer's speed, the search might take a few minutes.
  17. If the SNP is found (technically: if the text we've searched for was found in the file), the matching lines will be shown in the search results window at the bottom:
  18. Read below to learn more about the meaning of the values in DNA.Land's VCF file.
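
For readers comfortable with a little scripting, the search performed in glogg above can also be sketched in Python, reading the compressed .vcf.gz file directly so that no decompression step is needed. This is a minimal sketch, not part of DNA.Land's tooling: the function name find_variant is hypothetical, and the code assumes the file uses the standard tab-separated VCF layout with header lines starting with "#".

```python
import gzip


def find_variant(vcf_gz_path, rsid):
    """Return the data lines whose id column (field 3) equals rsid."""
    matches = []
    # Text mode ("rt") streams the file line by line, so the whole
    # file is never held in memory at once.
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip VCF header lines
            fields = line.rstrip("\n").split("\t")
            # Assumption: columns are tab-separated, id is the third one.
            if len(fields) > 2 and fields[2] == rsid:
                matches.append(line.rstrip("\n"))
    return matches
```

Calling find_variant with the path of the downloaded .vcf.gz file and "rs17822931" returns a list of matching lines, or an empty list if the identifier does not appear in the file. Comparing the id column exactly, rather than searching the whole line as text, avoids matching longer identifiers that merely start with the same digits.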
Each line in a DNA.Land VCF file contains 10 fields:
Field #        1    2          3          4    5    6     7       8     9       10
Field Name     chr  pos        id         ref  alt  qual  filter  info  format  data
Example Value  7    151626756  rs2374298  T    C    0     PASS    NS=1  GT:GL   0/0:-0.22,-0.40,-2.70
Below is the meaning of each field:
#   Field Name            Example Value          Comments
1   Chromosome (chr)      7                      The chromosome in which the variant is located.
2   Position (pos)        151626756              The genomic position of the variant in the Human Reference Genome version GRCh37 (aka hg19).
3   Identifier (id)       rs2374298              Global variant identifier. Can be searched on SNPedia, dbSNP and other resources.
4   Reference base (ref)  T                      The nucleotide found in the Human Reference Genome version GRCh37 (aka hg19).
5   Alternate base (alt)  C                      The alternate nucleotide (present when the genotype differs from the reference nucleotide).
6   Quality (qual)        0                      Not used in DNA.Land's VCF.
7   Filter                PASS                   Not used in DNA.Land's VCF.
8   Info                  NS=1                   Number of Samples. Always 1 in DNA.Land VCF files.
9   Format                GT:GL                  GenoType and Genotype-Likelihood markers (see below for further interpretation).
10  Data                  0/0:-0.22,-0.40,-2.70  Genotype and Genotype-Likelihood values of the variant for the imputed sample (see below for further interpretation).
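
The field layout above can be illustrated with a short Python sketch that splits one VCF line into the 10 named fields. The function name parse_vcf_line and the FIELD_NAMES list are hypothetical helpers chosen for this example, and the code assumes the fields are separated by tabs, as in standard VCF files.

```python
# Field names in the order they appear on each line (see the table above).
FIELD_NAMES = ["chr", "pos", "id", "ref", "alt",
               "qual", "filter", "info", "format", "data"]


def parse_vcf_line(line):
    """Split one tab-separated VCF data line into a dict of named fields."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(values)}")
    record = dict(zip(FIELD_NAMES, values))
    record["pos"] = int(record["pos"])  # the position is a whole number
    return record


# The example line from the table above.
example = "7\t151626756\trs2374298\tT\tC\t0\tPASS\tNS=1\tGT:GL\t0/0:-0.22,-0.40,-2.70"
record = parse_vcf_line(example)
print(record["id"], "is on chromosome", record["chr"], "at position", record["pos"])
```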


How to interpret FORMAT and DATA values in the VCF file?
The FORMAT and DATA fields contain corresponding values, separated by colons. In the example above, GT corresponds to 0/0 and GL corresponds to -0.22,-0.40,-2.70.
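
The colon-separated correspondence between the format field (field 9) and the data field (field 10) can be sketched in Python as follows. The function name pair_format_with_data is a hypothetical helper for this illustration only.

```python
def pair_format_with_data(format_field, data_field):
    """Match each colon-separated key in the format field with its data value."""
    keys = format_field.split(":")
    values = data_field.split(":")
    if len(keys) != len(values):
        raise ValueError("format and data fields have different numbers of parts")
    return dict(zip(keys, values))


# Example values taken from the table above.
sample = pair_format_with_data("GT:GL", "0/0:-0.22,-0.40,-2.70")
print(sample["GT"])  # the chosen genotype code
print(sample["GL"])  # the three genotype likelihoods, still comma-separated
```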



Interpreting REF/ALT and GT/GL values in the VCF file:
Genotype                                            Example Values from VCF file
GT Code     Meaning  Description           Abbrev.  REF  ALT  GT          Deduced genotype  GL (Genotype Likelihood)  Comments
0/0         REF/REF  Homozygous reference  hom_ref  T    C    0/0         TT                -0.22                     This genotype has the highest likelihood, thus it is the chosen genotype that will appear in the VCF file (GT=0/0).
0/1 or 1/0  REF/ALT  Heterozygous          het      T    C    0/1 or 1/0  TC or CT          -0.40                     In DNA.Land's VCF file, there is no way to differentiate between TC and CT; they are treated the same.
1/1         ALT/ALT  Homozygous alternate  hom_alt  T    C    1/1         CC                -2.70
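
The interpretation in the table above can be sketched in Python: the GT code is translated into nucleotides using REF and ALT, and the three GL values are compared to find the most likely genotype. The function name interpret_sample is hypothetical. The conversion to relative shares assumes that GL values are log10-scaled likelihoods, which is the convention in the VCF specification; this page does not state the scale explicitly, so treat that step as an assumption.

```python
def interpret_sample(ref, alt, data_field):
    """Decode a GT:GL data field such as '0/0:-0.22,-0.40,-2.70'."""
    gt, gl = data_field.split(":")
    alleles = [ref, alt]  # GT index 0 means REF, index 1 means ALT
    deduced = "".join(alleles[int(i)] for i in gt.split("/"))

    # The three GL values are listed in the order 0/0, 0/1, 1/1.
    codes = ["0/0", "0/1", "1/1"]
    log_likelihoods = dict(zip(codes, (float(x) for x in gl.split(","))))
    most_likely = max(log_likelihoods, key=log_likelihoods.get)

    # Assumption: GL is log10-scaled, so 10**GL gives relative likelihoods,
    # which are normalised here so that the three shares sum to 1.
    weights = {code: 10 ** value for code, value in log_likelihoods.items()}
    total = sum(weights.values())
    relative = {code: weight / total for code, weight in weights.items()}
    return deduced, most_likely, relative


deduced, most_likely, relative = interpret_sample("T", "C", "0/0:-0.22,-0.40,-2.70")
print(deduced, most_likely)
for code, share in relative.items():
    print(code, round(share, 3))
```

For the example values, the deduced genotype is TT and the most likely code is 0/0, matching the GT value stored in the file. The relative shares make the "noisy and probabilistic" nature of imputed data visible: the heterozygous genotype is less likely than 0/0 here, but far from ruled out.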