Files present in the Analysis directory:

.
├── Analysis
   ├── all_consensus_assembly.fa          		<- consensus sequence for each tag (read 1 and read 2 assemblies merged where possible)
   ├── all_consensus_assembly_G10.bam     		<- alignment for each sample against all_consensus_assembly.fa 
   ├── all_consensus_assembly_G1.bam
   ├── all_consensus_assembly_G2.bam
   ├── all_consensus_assembly_G3.bam
   ├── all_consensus_assembly_G5.bam
   ├── all_consensus_assembly_G6.bam
   ├── all_consensus_assembly_G7.bam
   ├── all_consensus_assembly_G8.bam
   ├── all_consensus_assembly_G9.bam
   ├── all_consensus_assembly_P1.bam
   ├── all_consensus_assembly_P2.bam
   ├── all_consensus_assembly_PR1-1.bam
   ├── all_consensus_assembly_PR1-2.bam
   ├── all_consensus_assembly_PR1-3.bam
   ├── all_consensus_assembly_PR2-1.bam
   ├── all_consensus_assembly_PR2-2.bam
   ├── all_consensus_assembly_PR2-3.bam
   ├── all_consensus_assembly_TestSample-G4.bam
   ├── all_consensus.bam     				<- alignment of all samples against all_consensus_assembly.fa
   ├── all_consensus_samtools1_snps_files.vcf.gz	<- raw variants called with samtools (bgzipped)
   ├── all_consensus_samtools1_snps_files.vcf.gz.tbi	<- index file for raw variants
   ├── all_consensus_snp_P1_P2_informative.vcf  	<- snps where the parents have 2 alleles
   ├── all_consensus_snp_PvsG_informative.vcf     	<- snps that differ between pedigree and population samples
   ├── all_consensus_var_P1_P2_informative.vcf     	<- variants where the parents have 2 alleles 
   ├── all_consensus_var_PvsG_informative.vcf     	<- variants that differ between pedigree and population samples 
   ├── all_summary_stat.txt     			<- summary statistics for all tags and samples 
   └── Analysis.txt     				<- this document

0 directories, 29 files

##########

The analysis has been done de novo.

The samples have been analysed with our RAD-seq analysis pipeline:

1. clustering of read 1 into stacks using ustacks v1.30 (parameters -t fastq
-p 1 -m 2 -M 2 -N 4 -H)
2. calling consensus for read 1 stacks using cstacks v1.30
3. filtering of stacks to remove those supported by fewer than 3 samples
4. assembly of read 2 of each stack/RADtag using idba_ud (v1.09)
5. merging read 2 assembled contigs with read 1 where possible
6. mapping of all reads of all samples to the assembled contigs using SMALT
(release 0.6.2)
7. calling raw SNPs for each tag using samtools mpileup (v. 1.2_64bit) and
bcftools call (v. 1.2)
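
Step 3 above can be illustrated with a minimal Python sketch. It assumes a
hypothetical mapping from stack ID to the sample names in which that stack was
observed; this is an illustrative structure, not the Stacks output format, and
the function name is made up for the example.

```python
def filter_stacks(stack_samples, min_samples=3):
    """Keep stacks observed in at least `min_samples` distinct samples.

    `stack_samples` is a hypothetical dict: stack ID -> iterable of sample names.
    Stacks supported by fewer than `min_samples` samples are dropped.
    """
    return {
        stack_id: samples
        for stack_id, samples in stack_samples.items()
        if len(set(samples)) >= min_samples
    }


if __name__ == "__main__":
    example = {
        "tag_1": ["G1", "G2", "G3"],  # 3 distinct samples: kept
        "tag_2": ["G1", "G1", "G2"],  # 2 distinct samples: removed
    }
    print(sorted(filter_stacks(example)))
```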

##########

Selection of informative variants and SNPs in the pedigree
Criteria: 
1. variants are present in both parents
2. genotype quality GQ > 20
3. read depth DP > 5
4. genotypes allowed (P1/P2): ab/ab, aa/ab, bb/ab, ab/aa, ab/bb
5. not INDEL <- SNPs only
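
The criteria above can be expressed as a small Python predicate. This is a
sketch under several assumptions that are not stated in this document: "a" is
taken to be the reference allele (0) and "b" the alternate allele (1), GQ and
DP are read per parent from the VCF FORMAT fields, and the "not INDEL" rule is
treated as applying to the snp files only (snps_only=True). The function and
argument names are hypothetical.

```python
# Allowed (P1, P2) genotype combinations, copied from criterion 4.
ALLOWED = {("ab", "ab"), ("aa", "ab"), ("bb", "ab"), ("ab", "aa"), ("ab", "bb")}


def classify(gt):
    """Map a diploid biallelic GT string to 'aa', 'ab' or 'bb'; None otherwise."""
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) != 2 or any(a not in ("0", "1") for a in alleles):
        return None  # missing call or more than two alleles
    return {0: "aa", 1: "ab", 2: "bb"}[alleles.count("1")]


def passes_quality(call, min_gq=20, min_dp=5):
    """Strict thresholds: GQ > 20 and DP > 5."""
    return call.get("GQ", 0) > min_gq and call.get("DP", 0) > min_dp


def is_parent_informative(ref, alt, p1, p2, snps_only=True):
    """Return True if the site passes the parental criteria.

    p1 and p2 are dicts with keys 'GT', 'GQ' and 'DP' for each parent.
    """
    if snps_only and (len(ref) != 1 or len(alt) != 1):
        return False  # indel, excluded when only SNPs are wanted
    if not (passes_quality(p1) and passes_quality(p2)):
        return False
    return (classify(p1["GT"]), classify(p2["GT"])) in ALLOWED


if __name__ == "__main__":
    p1 = {"GT": "0/1", "GQ": 35, "DP": 12}
    p2 = {"GT": "0/0", "GQ": 40, "DP": 9}
    print(is_parent_informative("A", "G", p1, p2))
```

Sites where both parents are homozygous (aa/aa, bb/bb, aa/bb, bb/aa) are
rejected because they are absent from the allowed list.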

##########

Selection of variants and SNPs differing between pedigree and population
samples
Criteria:
1. variants are present in both parents
2. variants are present in at least 2 population samples
3. genotype quality GQ > 20
4. read depth DP > 5
5. genotypes allowed (ped/pop): hom/homVar, homVar/hom
6. not INDEL <- SNPs only
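
A possible reading of these criteria in Python is sketched below. Several
points are assumptions rather than facts from this document: "hom" is taken as
homozygous reference (0/0) and "homVar" as homozygous alternate (1/1), a
variant counts as "present" in a sample when that sample has a call passing the
GQ and DP thresholds, all usable population calls must share the genotype
opposite to the parents, and the "not INDEL" rule applies to the snp files only
(snps_only=True). The function and argument names are hypothetical.

```python
def call_state(call, min_gq=20, min_dp=5):
    """Return 'hom', 'het' or 'homVar'; None for missing or low-quality calls."""
    if call.get("GQ", 0) <= min_gq or call.get("DP", 0) <= min_dp:
        return None
    gt = call.get("GT", "./.").replace("|", "/")
    return {"0/0": "hom", "0/1": "het", "1/0": "het", "1/1": "homVar"}.get(gt)


def is_ped_vs_pop_informative(ref, alt, parents, population,
                              snps_only=True, min_pop=2):
    """Return True if the pedigree and population samples are fixed for
    opposite alleles at this site.

    parents and population are lists of dicts with keys 'GT', 'GQ' and 'DP'.
    """
    if snps_only and (len(ref) != 1 or len(alt) != 1):
        return False  # indel, excluded when only SNPs are wanted
    if len(parents) != 2:
        return False
    ped = {call_state(c) for c in parents}
    # Keep only population samples with a usable call.
    pop = [s for s in map(call_state, population) if s is not None]
    if len(pop) < min_pop:
        return False
    return (ped, set(pop)) in [({"hom"}, {"homVar"}), ({"homVar"}, {"hom"})]


if __name__ == "__main__":
    def c(gt):
        return {"GT": gt, "GQ": 40, "DP": 12}

    print(is_ped_vs_pop_informative("A", "T", [c("0/0"), c("0/0")],
                                    [c("1/1"), c("1/1"), c("./.")]))
```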