SMALT employs a hash index of short words, up to 20 nucleotides long, sampled at equidistant steps along the reference genome. For each sequencing read, potentially matching segments in the reference genome are identified from seed matches in the index and subsequently aligned with the read using dynamic programming. The best gapped alignments of each read are reported, including a score for the reliability of the best mapping. The user can adjust the trade-off between sensitivity and speed by tuning the length and spacing of the hashed words.

Mapping with SMALT involves two steps. First, a hash index has to be generated for the genomic reference sequences. Then the sequencing reads are mapped onto the reference using the index.

> smalt index -k 14 -s 8 hs38_k14s8 GRCh38.fasta

builds a hash index for the human genome in the FASTA file GRCh38.fasta. Words of 14 base pairs in length are sampled at every 8th position in the genome. Two files, hs38_k14s8.smi and hs38_k14s8.sma, are written to disk.

> smalt map -o mapped.sam hs38_k14s8 mates_1.fastq mates_2.fastq

maps the reads in mates_1.fastq and mates_2.fastq onto the indexed reference and writes the alignments to mapped.sam.
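
The idea of hashing words of a fixed length at equidistant positions, then looking up seed matches for a read, can be illustrated with a small Python sketch. This is a simplified illustration only, not SMALT's actual implementation: the function names build_index and seed_hits are hypothetical, the toy reference sequence is made up, and a short word length (4) and step (2) stand in for values such as 14 and 8 so the result is easy to follow.

```python
from collections import defaultdict


def build_index(reference: str, k: int = 14, step: int = 8) -> dict[str, list[int]]:
    """Map each sampled word of length k to the reference positions where it starts."""
    index = defaultdict(list)
    # Only positions 0, step, 2*step, ... are sampled, which keeps the index small.
    for pos in range(0, len(reference) - k + 1, step):
        index[reference[pos:pos + k]].append(pos)
    return dict(index)


def seed_hits(read: str, index: dict[str, list[int]], k: int) -> list[tuple[int, int]]:
    """Return (read_offset, reference_position) pairs for read words found in the index."""
    hits = []
    # Every word of the read is looked up, so a sampled reference word
    # can be found regardless of where the read starts.
    for offset in range(len(read) - k + 1):
        for ref_pos in index.get(read[offset:offset + k], ()):
            hits.append((offset, ref_pos))
    return hits


# Toy data (assumed for illustration only).
reference = "ACGTTGCATGCCGATAGGCTTAACGGATCC"
index = build_index(reference, k=4, step=2)
hits = seed_hits("GCCGATAG", index, k=4)
print(hits)  # [(1, 10), (3, 12)]
```

Both hits share the same difference between reference position and read offset (9), which points to a candidate segment starting at reference position 9. In a real mapper such a candidate would then be aligned with the read using dynamic programming, a step this sketch leaves out. A larger step shrinks the index and speeds up lookup but leaves fewer sampled words per read, which is the sensitivity versus speed trade-off mentioned in the text.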