microhapdb marker

Retrieve marker records by identifier or query

usage: microhapdb marker [-h] [--ae-pop POP] [--panel FILE] [--region RGN]
                         [--query QRY] [--format {table,detail,fasta,offsets}]
                         [--columns C] [--delta D] [--min-length L]
                         [--extend-mode E] [--notrunc]
                         [id ...]

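Positional Arguments

id

zero or more marker identifiers; as the usage line shows, identifiers may be omitted when markers are selected with --panel, --region, or --query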

Data Retrieval

Configure how marker records are retrieved from the database.

--ae-pop

specify the 1000 Genomes population from which to report effective number of alleles in the “Ae” column; by default, the Ae value averaged over all 26 1KGP populations is reported

--panel

file containing a list of marker names/identifiers, one per line

--region

restrict results to the specified genomic region; format chrX:YYYY-ZZZZZ

--query

retrieve records using a pandas-style query expression, e.g. 'Source == "ALFRED"'
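The region format accepted by --region can be illustrated with a small parser. This is only a sketch of the documented chrX:YYYY-ZZZZZ shape: parse_region and REGION_PATTERN are hypothetical names and not part of MicroHapDB, and the tool may accept or reject forms that this pattern does not cover.

```python
import re

# Hypothetical helper, not MicroHapDB code: split a region string of the
# documented form chrX:YYYY-ZZZZZ into chromosome, start, and end.
REGION_PATTERN = re.compile(r"^(chr[0-9A-Za-z]+):(\d+)-(\d+)$")


def parse_region(region):
    """Return (chrom, start, end) for a string such as 'chr18:1-25000000'."""
    match = REGION_PATTERN.match(region)
    if match is None:
        raise ValueError(f"expected chrX:YYYY-ZZZZZ, got {region!r}")
    chrom = match.group(1)
    start = int(match.group(2))
    end = int(match.group(3))
    if start > end:
        raise ValueError(f"region start {start} exceeds end {end}")
    return chrom, start, end


print(parse_region("chr18:1-25000000"))  # ('chr18', 1, 25000000)
```

Validating the string before passing it to the command line catches malformed regions early, for instance a missing chr prefix or a reversed range.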

Formatting

Configure how results are formatted. Some formats include information for a ‘target sequence’ for each marker, representing what would be targeted by e.g. hybridization capture probes or PCR primers for amplicon sequencing. MicroHapDB computes the endpoints of these target sequences by extending --delta=D nucleotides beyond the first and last SNPs defining the marker, and then, if needed, extending further until --min-length=L is satisfied. Configuration of these and related parameters is described below.
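The two-step extension described above can be sketched in a few lines. This is an illustration under stated assumptions and not MicroHapDB's implementation: target_interval is a hypothetical function, SNP positions are assumed to be integer offsets on the forward strand, the result is a half-open interval, the 5' end is taken to be the lower coordinate, and in symmetric mode any odd leftover nucleotide is assumed to go to the 3' side. The defaults mirror the documented --delta and --min-length values of 10 and 80.

```python
def target_interval(positions, delta=10, min_length=80, extend_mode="symmetric"):
    """Return (start, end) of a target sequence as a half-open interval.

    Hypothetical sketch: coordinate conventions and the handling of odd
    shortfalls are assumptions, and clamping to chromosome bounds is omitted.
    """
    if not positions:
        raise ValueError("at least one SNP position is required")

    # Step 1: extend delta nucleotides beyond the first and last SNPs.
    start = min(positions) - delta
    end = max(positions) + delta + 1

    # Step 2: extend further only if the minimum length is not yet met.
    shortfall = min_length - (end - start)
    if shortfall > 0:
        if extend_mode == "5":
            start -= shortfall
        elif extend_mode == "3":
            end += shortfall
        elif extend_mode == "symmetric":
            left = shortfall // 2
            start -= left
            end += shortfall - left
        else:
            raise ValueError(f"unknown extend mode {extend_mode!r}")
    return start, end


snps = [1000, 1020, 1045]
print(target_interval(snps))                   # (983, 1063)
print(target_interval(snps, extend_mode="3"))  # (990, 1070)
print(target_interval(snps, min_length=50))    # (990, 1056)
```

In the first call the delta extension yields a 66 nt interval, so 14 more nucleotides are split evenly between the two ends to reach 80. In the last call the interval already exceeds the minimum length and is left as is.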

--format

Possible choices: table, detail, fasta, offsets

--columns

string of column codes indicating which fields to include in tabular output; n=NumVars x=Extent c=Chrom s=Start e=End p=Positions q=Positions37 r=RSIDs a=Ae; by default C=nxcsea

--delta

extend D nucleotides beyond the marker extent when computing target sequence boundaries; by default D=10

--min-length

minimum length of the target sequence; by default L=80

--extend-mode

specify how the target sequence will be extended to satisfy the minimum length criterion; use 5 to extend only the 5’ end, 3 to extend only the 3’ end, or symmetric to extend both ends equally; by default, symmetric mode is used

--notrunc

disable truncation of tabular results
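The column codes listed for --columns amount to a lookup from single letters to field names. The sketch below spells that mapping out; expand_columns is a hypothetical helper rather than MicroHapDB code, and it assumes that output columns follow the order of the codes and that unknown codes are rejected, neither of which is confirmed by the text above.

```python
# Mapping taken from the --columns description; the helper itself is hypothetical.
COLUMN_CODES = {
    "n": "NumVars",
    "x": "Extent",
    "c": "Chrom",
    "s": "Start",
    "e": "End",
    "p": "Positions",
    "q": "Positions37",
    "r": "RSIDs",
    "a": "Ae",
}
DEFAULT_COLUMNS = "nxcsea"  # documented default value of C


def expand_columns(codes=DEFAULT_COLUMNS):
    """Translate a string of column codes into a list of field names."""
    unknown = [code for code in codes if code not in COLUMN_CODES]
    if unknown:
        raise ValueError(f"unknown column codes: {''.join(unknown)}")
    return [COLUMN_CODES[code] for code in codes]


print(expand_columns())        # ['NumVars', 'Extent', 'Chrom', 'Start', 'End', 'Ae']
print(expand_columns("csep"))  # ['Chrom', 'Start', 'End', 'Positions']
```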

Examples

microhapdb marker mh01NK-001
microhapdb marker --format=fasta mh13KK-218 mh04CP-002 mh02AT-05
microhapdb marker --format=fasta --panel mypanel.txt
microhapdb marker --format=detail --min-length=125 --extend-mode=3 MHDBM-dc55cd9e
microhapdb marker --region=chr18:1-25000000
microhapdb marker --query='Source == "ALFRED"' --ae-pop CEU
microhapdb marker --query='Name.str.contains("PK")'